DOCUMENT RESUME

ED 209 J 36                                        2fl 610 879

AUTHOR        Kingsbury, G. Gage; Weiss, David J.
TITLE         An Adaptive Testing Strategy for Mastery Decisions.
              Research Report 79-5.
INSTITUTION   Minnesota Univ., Minneapolis. Dept. of Psychology.
SPONS AGENCY  Office of Naval Research, Arlington, Va. Personnel
              and Training Research Programs Office.
PUB DATE      Sep 79
CONTRACT      N00014-76-C-0627
NOTE          a9p.
EDRS PRICE    MF01/PC02 Plus Postage.
DESCRIPTORS   Achievement Rating; *Achievement Tests; Competence;
              *Computer Assisted Testing; *Efficiency; Latent Trait
              Theory; *Mastery Tests; Military Personnel; Program
              Effectiveness; Testing; Test Reliability
IDENTIFIERS   *Adaptive Testing

ABSTRACT

The theory and technology of item characteristic curve (ICC) response
theory and adaptive testing were applied to judging individuals'
competencies against a prespecified mastery level to determine whether
each individual is a "master" or "nonmaster" of a specified content
domain. Items from two conventionally administered mastery tests
administered in a military training environment were calibrated using the
unidimensional three-parameter logistic ICC model. Using response data
from the conventional administration of the tests, a computerized adaptive
mastery testing (AMT) strategy was applied in a real-data simulation.
The AMT procedure used ICC theory to transform the traditional arbitrary
"proportion correct" mastery level to the ICC achievement metric in order
to allow adaptation of the test to each trainee's achievement level
estimate, which was calculated after each item response. A mastery
decision was made for the trainee after the 95 percent Bayesian confidence
interval around his achievement level estimate failed to contain the
prespecified mastery level. The AMT procedure reduced the average test
length over all circumstances examined, while reaching the same decision
as the conventional procedure for 96 percent of the trainees. Advantages
and possible applications of AMT procedures in certain classroom
situations are noted and discussed. (Author/DtfH)



* Reproductions supplied by EDRS are the best that can be made *
* from the original document.                                  *




AN ADAPTIVE TESTING STRATEGY 
FOR MASTERY DECISIONS 



G. Gage Kingsbury 
and 

David J. Weiss 



U.S. DEPARTMENT OF EDUCATION
NATIONAL INSTITUTE OF EDUCATION
EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)
This document has been reproduced as
received from the person or organization
originating it.

Minor changes have been made to improve
reproduction quality.

Points of view or opinions stated in this document
do not necessarily represent official NIE
position or policy.



Research Report 79-5 
September 1979 






Psychometric Methods Program 
Department of Psychology 
University of Minnesota 
Minneapolis, MN 55455 






This research was supported by funds from the Army Research
Institute, Air Force Human Resources Laboratory, Defense Advanced
Research Projects Agency, Navy Personnel Research and Development
Center, and the Office of Naval Research, and
monitored by the Office of Naval Research.

Approved for public release; distribution unlimited. 
Reproduction in whole or in part is permitted for 
any purpose of the United States Government. 






Unclassified
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE (READ INSTRUCTIONS BEFORE COMPLETING FORM)

1. REPORT NUMBER: Research Report 79-5
2. GOVT ACCESSION NO.:
3. RECIPIENT'S CATALOG NUMBER:
4. TITLE (and Subtitle): An Adaptive Testing Strategy for Mastery Decisions
5. TYPE OF REPORT & PERIOD COVERED: Technical Report
6. PERFORMING ORG. REPORT NUMBER:
7. AUTHOR(s): G. Gage Kingsbury and David J. Weiss
8. CONTRACT OR GRANT NUMBER(s): N00014-76-C-0627
9. PERFORMING ORGANIZATION NAME AND ADDRESS:
   Department of Psychology
   University of Minnesota
   Minneapolis, MN 55455
10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS:
    P.E.: 61153N    PROJ.: RR042-04
    T.A.: RR042-04-01
    W.U.: NR150-389
11. CONTROLLING OFFICE NAME AND ADDRESS:
    Personnel and Training Research Programs
    Office of Naval Research
    Arlington, VA 22217
12. REPORT DATE: September 1979
13. NUMBER OF PAGES: 16_
14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):
15. SECURITY CLASS. (of this report): Unclassified
15a. DECLASSIFICATION/DOWNGRADING SCHEDULE:
16. DISTRIBUTION STATEMENT (of this Report):
    Approved for public release; distribution unlimited. Reproduction in whole or
    in part is permitted for any purpose of the United States Government.
17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report):
18. SUPPLEMENTARY NOTES:
    This research was supported by funds from the Army Research Institute, Air
    Force Human Resources Laboratory, Defense Advanced Research Projects Agency,
    Navy Personnel Research and Development Center, and the Office of Naval
    Research, and monitored by the Office of Naval Research.
19. KEY WORDS (Continue on reverse side if necessary and identify by block number):
    testing                  achievement testing      latent trait test theory
    tailored testing         computerized testing     item response theory
    programmed testing       adaptive testing         response-contingent testing
    automated testing        sequential testing       individualized testing
                             branched testing

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

In an attempt to increase the efficiency of mastery testing while main- 
taining a high level of confidence for each mastery decision, the theory and 
technology of item characteristic curve (ICC) response theory (Lord & Novick, 
1968) and adaptive testing were applied to the problem of judging individuals' 
competencies against a prespecified mastery level to determine whether each 
individual is a "master" or a "nonmaster" of a specified content domain. Items
from two conventionally administered classroom mastery tests administered in a
military training environment were calibrated using the unidimensional
three-parameter logistic ICC model. Then, using response data originally obtained
from the conventional administration of the tests, a computerized adaptive
mastery testing (AMT) strategy was applied in a real-data simulation.

The AMT procedure used ICC theory to transform the arbitrary "proportion
correct" mastery level used in traditional mastery testing to the ICC achievement
metric in order to allow the adaptation of the test to each trainee's
achievement level estimate, which was calculated after each item response.
Adaptive testing continued until the 95% Bayesian confidence interval around 
the trainee's achievement level estimate failed to contain the prespecified 
mastery level. At that point testing was terminated, and a mastery decision 
was made for the trainee. 

Results obtained from the AMT procedure were compared to results obtained 
from the traditional mastery testing paradigm in terms of the reduction in 
mean test length, information characteristics, and the correspondence between 
decisions made by the two procedures for three different mastery levels and 
for each of the two tests. The AMT procedure reduced the average test length 
30% to 81% over all circumstances examined (with modal test length reductions 
of up to 92%), while reaching the same decision as the conventional procedure
for 96% of the trainees. 

Additional advantages and possible applications of AMT procedures in 
certain classroom situations are noted and discussed, and further research 
questions are suggested. 






CONTENTS

Introduction
Objectives
The Adaptive Mastery Testing Procedure
Mastery and the Achievement Metric
Adaptive Item Selection and Scoring
Item Selection
Estimation of θ
Bayesian Confidence Intervals: Making the Mastery Decision
Illustration
Method
Subjects and Tests
Fitting the ICC Response Model
Estimation of the Item Parameters
Evaluating the Fit of the Model
Simulation of AMT
Comparison of Efficiency: AMT versus Conventional Testing
Results
Applicability of the ICC Model
Factor Analysis
Estimation of the ICC Parameters
Conversion of the Mastery Level to the ICC Metric
Test Length
Total Group
Mastery Groups
Nonmastery Groups
High-Confidence Groups
Correspondence Between Decisions
Information Functions
Discussion and Conclusions
Additional Advantages of the AMT Strategy
References
Appendix A: Illustration of MISS Procedure for Choosing Items for AMT
Appendix B: Supplementary Tables







Acknowledgments 



Test data utilized in this study were obtained from Air 
Force personnel enrolled in the Weapon Mechanics course 
at the Lowry Air Force Base Technical Training Center 
from 1977 to 1978. The authors extend their appreciation 
to Brian Waters and Larry Click of the Air Force for
making these data available and to Joel Brown for arrang- 
ing for the data to be transferred in a usable form. 


Technical Editor: Barbara Leslie Caram 






An Adaptive Testing Strategy for 
Mastery Decisions 



During the past 15 years, considerable interest in the psychological and 
educational measurement community has been directed toward the evaluation of 
student competency in various fields of study. In the simplest case, compe- 
tency in a field has been operationalized as some minimum skill level above 
which a student is declared a "master" and below which a student is declared 
a "nonmaster." Mastery testing has been developed as an implementation of the 
more general criterion-referenced test interpretation model formulated by 
Glaser and Klaus (1962) and expanded upon by many since then (e.g., Hambleton, 
Swaminathan, Algina, & Coulson, 1978; Popham, 1971; Popham & Husek, 1969). 

"Mastery" has typically been defined by subject matter experts as the min- 
imum percentage of items that a student should be able to answer from a given 
set of test items in order to be classified as proficient. Therefore, a stu- 
dent who correctly answered only the minimum acceptable percentage of items on 
a test of this type would be declared a master, and a student who correctly 
answered one item less would be declared a nonmaster in the subject matter area.
So that all of the mastery decisions made would be comparable, mastery testing 
has traditionally required all students to answer the same set of test questions. 

This approach to mastery testing has several problems. First, a student 
whose test score is far above the specified cutoff score would be said to be 
a master of the subject matter; similarly, a student whose score was just bare- 
ly above the cutoff score would also be declared a master, but presumably that 
decision would be made with less confidence. Thus, classical mastery testing 
results in different levels of intuitive confidence for students whose raw 
scores fall at different distances above or below the cutoff, which results in 
decisions with different dependabilities for students with different raw scores. 

This problem has been discussed on the group level by Livingston (1972) in 
a study discussing the reliability of criterion-referenced tests as a function 
of the mean score level of the testee group. Hambleton and Novick (1973) and 
Davis and Diamond (1974) have specified methods to develop cutoff rules designed 
to yield certain desired ratios of false positive and false negative decisions 
through the use of the differential accuracy of decisions made at different raw 
score levels, but little research has been directed toward equalizing the con- 
fidence levels in decisions made by a mastery test across all levels of per- 
formance. Hambleton and Novick (1973) have suggested that the use of Bayesian 
point estimation of students' mastery scores might improve the accuracy of mas- 
tery decisions; it will be shown in this report that the use of Bayesian con- 
fidence interval estimates may be useful in equalizing the confidence in de- 
cisions made across all levels of observed performance. 

A second problem with the classical mastery testing paradigm is that each 
student tested is given the same set of test questions, even though the set of 
questions may be inappropriate for any reasonably precise measurement at some 



-2- 



achievoment levelh. In the mastery testing area, attempts have been made to 
adapt the test to each student Ve.g., Ferguson, 1970); but these attempts have 
almost universally assumed that all items administered were of equal quality. 
It is possible, through the use of item characteristic curve (ICC) response 
theory (Lord & Novick, 1968), to distinguish between items which yield differ- 
ent amounts of information concerning different trait levels. 

Several authors (e.g., Bejar, Weiss, & Gialluca, 1977; McBride & Weiss,
1976; Urry, 1977) have demonstrated that adaptive testing procedures using ICC
response theory can reduce test length with no reduction in measurement
precision. These testing procedures adapt the difficulty and information
characteristics of each individual's test by drawing from large item pools items
that are matched to the individual's estimated trait level. These results
indicate that by making use of all of the information available about the test
items and the individual's estimated achievement levels, the application of 
adaptive testing procedures using ICC response theory to a traditional mastery 
testing situation might result in a decrease in the test length needed to make 
confident decisions concerning each individual's mastery status. 



Objectives

This report describes the design and application of an adaptive mastery
testing strategy that eliminates these problems of the traditional mastery
testing approach. The adaptive mastery testing strategy is designed to reduce the
average test length for each student, while equalizing the level of confidence
in decisions made across the entire range of the achievement continuum. This
report compares the performance of the conventional and adaptive mastery testing
procedures within the context of one course of instruction in terms of
efficiency, information characteristics, and level of correspondence between
mastery decisions.



The Adaptive Mastery Testing Procedure

The adaptive mastery testing (AMT) procedure is designed to administer
achievement test items selected from a classical mastery test, but not all items
are administered to each student. The test items administered to a given
student are selected to provide the most information concerning the achievement
level of that student. Mastery decisions are made with a specified degree of
confidence for each student, using a cutoff point prespecified on the achievement
continuum.

There are three important components of the AMT procedure. The first in- 
volves converting the mastery level to the achievement metric. The second com- 
ponent is the item-selection technique used to determine which items should be 
administered to a specific student. The final component of the AMT strategy 
involves the manner in which the mastery decision is made and the degree of con- 
fidence that can be placed in the decision once it has been made. 

Mastery and the Achievement Metric

The classical mastery testing procedure specifies a percentage of the items
on a test that must be correctly answered by a student in order to be declared
a master. Using ICC theory, it is possible to generate an analogue to the
"percentage" cutoff of classical theory for use in adaptive testing. This is
necessary, since in an adaptive test each individual will tend to answer about 50%
of the items correctly, given a large enough item pool, because the items
administered will be selected to be close to the individual's achievement level
(Vale & Weiss, 1975; Weiss, 1973). The ICC analogue of proportion correct is
based on the use of the test characteristic curve (TCC). The TCC is the
function that relates the ICC achievement continuum to the expected proportion of
correct answers that an individual at any achievement level may be expected to
obtain if all of the items on the test were administered.

For this study the assumption was made that a three-parameter logistic
ogive would describe the functional relationship between the latent trait
(achievement) and the probability of observing a correct response to any of the
items on the test. This assumption yields a TCC of the following form:



\[
E(P \mid \theta) = \frac{1}{n} \sum_{i=1}^{n} \left[ \hat{c}_i + (1 - \hat{c}_i)\,
\frac{\exp[1.7\,\hat{a}_i(\theta - \hat{b}_i)]}{\exp[1.7\,\hat{a}_i(\theta - \hat{b}_i)] + 1} \right]
\qquad [1]
\]

where

$E(P \mid \theta)$ = the expected value of the proportion of correct answers observed
on the test, given an achievement level;
$\hat{a}_i$ = the estimate of the ICC discrimination parameter for item i;
$\hat{b}_i$ = the estimate of the ICC difficulty parameter for item i;
$\hat{c}_i$ = the estimate of the lower asymptote of the ICC for item i;
$n$ = the number of items on the test; and
$\theta$ = a given achievement level.



Thus, as Equation 1 indicates, the expected proportion correct at a given level
of achievement (θ) is the average, over all items in the test, of the
probability of a correct response for each item, given the three ICC item parameters
for each item and assuming a logistic ICC.
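
As a rough illustration of Equation 1, the following Python sketch computes the
TCC as the mean of the three-parameter logistic ICCs. The function names
`icc_3pl` and `tcc` and the five (a, b, c) triples in `ITEMS` are hypothetical
and are not taken from this report; only the functional form and the scaling
constant 1.7 come from the text.

```python
import math

def icc_3pl(theta, a, b, c, D=1.7):
    """Probability of a correct response under the three-parameter logistic ICC."""
    z = D * a * (theta - b)
    # exp(z) / (exp(z) + 1) written in the equivalent form 1 / (1 + exp(-z))
    return c + (1.0 - c) / (1.0 + math.exp(-z))

def tcc(theta, items):
    """Expected proportion correct at theta: the mean ICC over all items."""
    return sum(icc_3pl(theta, a, b, c) for a, b, c in items) / len(items)

# Hypothetical (a, b, c) estimates for a five-item test (illustration only).
ITEMS = [(1.0, -1.0, 0.2), (1.2, -0.5, 0.2), (0.8, 0.0, 0.2),
         (1.5, 0.5, 0.2), (1.0, 1.0, 0.2)]

if __name__ == "__main__":
    for theta in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"theta = {theta:+.1f}   expected proportion correct = {tcc(theta, ITEMS):.3f}")
```

With a common lower asymptote of .2, the curve in this sketch rises from about
.2 at very low achievement levels toward 1.0 at very high ones.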



This monotonically increasing function permits relating any achievement
level to its most likely proportion correct or, more importantly in this
context, determining the achievement level (θ) which will most probably result in
any given proportion of correct answers. An example of the use of the TCC in
determining an achievement level that is comparable to a desired "percentage"
cutoff is shown in Figure 1 using a hypothetical TCC. To determine a level of
achievement that corresponds to, for example, a 70% mastery level on the test
items which comprise the TCC, these steps would be followed:

1. Draw a horizontal line (line A in Figure 1) from the P = .7 mark on
the vertical (expected proportion correct, or P) axis of the TCC plot
to the TCC.

2. Drop a vertical line (line B) from the point of intersection of the
TCC and the horizontal line drawn in Step 1 to the horizontal (achievement
level, or θ) axis. This point (θ_m) on the achievement level axis
is designated the mastery level using the achievement metric.



Figure 1

Hypothetical Test Characteristic Curve Illustrating
Conversion from a Proportion Correct Mastery Level
to the Achievement Metric

[Plot not reproduced. Horizontal axis: Achievement Level (θ);
vertical axis: expected proportion correct.]



3. The cutoff point specified in Step 2 may now be used to make mastery
decisions in place of the P = .7 mastery level originally specified.
Once the mastery level is expressed in the achievement metric (θ),
rather than in terms of proportion correct, it is no longer necessary
to administer all the items in the test to obtain an achievement
level estimate for an individual and a corresponding mastery
decision. An achievement level estimate can then be obtained using any
subset of items from the original test, provided that the individual's
item responses are scored with a method that will put the achievement
level estimate on the same metric as the TCC. Any ICC-based scoring
procedure (Bejar & Weiss, 1979), in conjunction with the original item
parameter estimates, will result in an achievement level estimate
which will be on the θ metric.

This procedure allows conversion of any desired proportion correct mastery
level to the θ metric. Once this transfer is made, ICC theory and adaptive
testing strategies may be used to increase the efficiency of mastery testing
techniques.
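
The graphical steps above amount to inverting the TCC numerically. The Python
sketch below does this by bisection, which is an assumption of this example
(the report describes only the graphical procedure). The function name
`mastery_level`, the search bounds of -6 and +6, and the item parameters in
`ITEMS` are all hypothetical.

```python
import math

def tcc(theta, items, D=1.7):
    """Test characteristic curve: mean three-parameter logistic ICC."""
    total = 0.0
    for a, b, c in items:
        total += c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))
    return total / len(items)

def mastery_level(p_cut, items, lo=-6.0, hi=6.0, tol=1e-6):
    """Find theta_m such that tcc(theta_m) equals the proportion correct cutoff.

    Relies on the TCC being monotonically increasing in theta.
    """
    if not tcc(lo, items) < p_cut < tcc(hi, items):
        raise ValueError("cutoff is not reached by the TCC on the search interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tcc(mid, items) < p_cut:
            lo = mid          # cutoff lies to the right of mid
        else:
            hi = mid          # cutoff lies at or to the left of mid
    return 0.5 * (lo + hi)

# Hypothetical (a, b, c) estimates (illustration only).
ITEMS = [(1.0, -1.0, 0.2), (1.2, -0.5, 0.2), (0.8, 0.0, 0.2),
         (1.5, 0.5, 0.2), (1.0, 1.0, 0.2)]

if __name__ == "__main__":
    theta_m = mastery_level(0.70, ITEMS)
    print(f"theta_m for a 70% cutoff: {theta_m:.3f}")
```

Note that a cutoff at or below the average lower asymptote cannot be converted,
because the TCC never falls that low; the sketch raises an error in that case.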

Adaptive Item Selection and Scoring

To make mastery testing a more efficient process, the objectives of the
AMT strategy were (1) to reduce the length of each student's test by
eliminating test items which provided little information concerning the student's
achievement level and (2) to terminate the AMT procedure after enough
information had been obtained so that the mastery decision could be made with a
high degree of confidence.

To operationalize the first objective, items were selected to be
administered to each student at each point during the testing procedure on the basis
of the amount of information that the item provided concerning the student's
achievement level estimate at that point in testing. The administration of the test
item which provides the most information concerning the student's present
achievement level estimate should provide the most efficient use of testing time. A
procedure that selects and administers the most informative item at each point
in an adaptive testing procedure was described by Brown and Weiss (1977), and
this procedure was used in the present study. This procedure uses an adaptive
maximum information search and selection (MISS) technique for the sequential
selection of test items to be administered to each individual.

Item selection. The information that an item provides at each point along 
the achievement continuum can be determined from the ICC parameters of the item. 
Using the unidimensional three-parameter logistic ICC model (Birnbaum, 1968) to 
describe responses to the five-alternative multiple-choice items used in this 
study, the information available in any item is (Birnbaum, 1968, Equation 20.4.16) 



\[
I_i(\theta) = \frac{(1 - c_i)\,D^2 a_i^2\,\psi^2[D L_i(\theta)]}
{\psi[D L_i(\theta)] + c_i\,\Psi^2[-D L_i(\theta)]}
\qquad [2]
\]

where

$I_i(\theta)$ = the information available from item i at any achievement level $\theta$;
$a_i$ = the ICC discrimination parameter of the item;
$c_i$ = the lower asymptote of the ICC for the item;
$D$ = 1.7, a scaling factor used to allow the logistic ICC to closely
approximate a normal ogive;
$L_i(\theta) = a_i(\theta - b_i)$, where $b_i$ is the ICC difficulty parameter of the item;
$\psi$ = the logistic probability density function; and
$\Psi$ = the cumulative logistic function.
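
Equation 2 can be evaluated directly, and the item with the largest value at
the current achievement level estimate can then be picked. The Python sketch
below shows only that selection criterion; it is not the MISS search procedure
of Brown and Weiss (1977), whose details are not given in this section. The
function names and the three-item pool are hypothetical.

```python
import math

D = 1.7  # scaling factor from Equation 2

def item_information(theta, a, b, c):
    """Item information for the three-parameter logistic model (Equation 2)."""
    x = D * a * (theta - b)               # D * L_i(theta)
    Psi = 1.0 / (1.0 + math.exp(-x))      # cumulative logistic function at x
    Psi_neg = 1.0 - Psi                   # cumulative logistic function at -x
    psi = Psi * Psi_neg                   # logistic density at x
    return (1.0 - c) * D**2 * a**2 * psi**2 / (psi + c * Psi_neg**2)

def most_informative(theta_hat, items, administered):
    """Index of the not-yet-administered item with maximum information at theta_hat."""
    candidates = [i for i in range(len(items)) if i not in administered]
    if not candidates:
        raise ValueError("item pool exhausted")
    return max(candidates, key=lambda i: item_information(theta_hat, *items[i]))

# Hypothetical pool: equal discrimination and guessing, difficulties -2, 0, +2.
POOL = [(1.0, -2.0, 0.2), (1.0, 0.0, 0.2), (1.0, 2.0, 0.2)]

if __name__ == "__main__":
    first = most_informative(0.0, POOL, set())
    second = most_informative(0.0, POOL, {first})
    print("first item chosen:", first, " second item chosen:", second)
```

Because of the lower asymptote, information is not symmetric about the item
difficulty: an item that is too easy for the student retains more information
than an equally distant item that is too hard.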



provided an explanation and scoring programs for Owen's method.



Owen's θ estimation procedure has been shown to yield biased estimates of
trait levels (Kingsbury & Weiss, 1979; Lord, 1976; McBride & Weiss, 1976).
This bias may be attributed to the assumption of a normal distribution of θ in
the population made by Owen's procedure (Lord, 1976) and/or to inappropriate
prior information concerning θ on the individual level (Kingsbury & Weiss, 1979).
The bias inherent in this scoring method may render the MISS technique less
efficient than it would be under optimal conditions, and thereby may reduce the
efficiency of the AMT technique as a whole.

To use MISS under optimal conditions, θ estimates should be obtained through
the use of a maximum likelihood estimation technique, which yields
asymptotically efficient estimates (Birnbaum, 1968). Maximum likelihood θ estimation
techniques are not able, however, to obtain trait level estimates for consistent
item response patterns (either all correct or all incorrect responses) or for
item response patterns for which the likelihood function is extremely flat.
Owen's Bayesian scoring method will yield an estimate for any response pattern.
The inability of the maximum likelihood procedures to estimate θ for some
response patterns militates against the use of a maximum likelihood estimation
procedure in this situation, since it would be necessary to assign arbitrary θ
estimates during the early stages of item selection and scoring. Thus, the
Bayesian scoring procedure was used in order to obtain θ estimates for each
student after each item administered by the adaptive testing procedure, even
though some efficiency might have been lost in the AMT due to the bias inherent
in the estimation procedure. Use of the Bayesian θ estimation procedure in
this study also allowed the use of easily interpretable Bayesian confidence
intervals to make the mastery decision.
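
The difficulty with consistent response patterns can be seen numerically. In
the Python sketch below, the log-likelihood for an all-correct pattern keeps
increasing with θ, so no finite maximum exists, while a Bayesian estimate under
a normal prior stays finite. The estimate here is a posterior mean computed on
a grid, used as a simple stand-in; it is not Owen's scoring algorithm. The
standard normal prior, the grid from -4 to +4, and the item parameters are
assumptions of this example.

```python
import math

D = 1.7

def p_correct(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def log_likelihood(theta, items, responses):
    """Log-likelihood of a 0/1 response pattern at a given theta."""
    ll = 0.0
    for (a, b, c), u in zip(items, responses):
        p = p_correct(theta, a, b, c)
        ll += math.log(p) if u == 1 else math.log(1.0 - p)
    return ll

def posterior_mean(items, responses, prior_mean=0.0, prior_sd=1.0):
    """Grid approximation to the posterior mean of theta under a normal prior."""
    grid = [-4.0 + 0.01 * k for k in range(801)]
    weights = [math.exp(log_likelihood(t, items, responses)
                        - 0.5 * ((t - prior_mean) / prior_sd) ** 2)
               for t in grid]
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)

# Hypothetical items and an all-correct response pattern (illustration only).
ITEMS = [(1.0, -1.0, 0.2), (1.2, 0.0, 0.2), (0.8, 1.0, 0.2)]
ALL_CORRECT = [1, 1, 1]

if __name__ == "__main__":
    for theta in (1.0, 3.0, 6.0):
        print(f"theta = {theta:.1f}  log-likelihood = {log_likelihood(theta, ITEMS, ALL_CORRECT):.5f}")
    print(f"grid posterior mean = {posterior_mean(ITEMS, ALL_CORRECT):.3f}")
```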

Bayesian Confidence Intervals: Making the Mastery Decision

Any achievement level estimate (θ̂) obtained using ICC-based scoring of any
subset of the items from the original test and their ICC item parameters will
be on the same metric as the TCC for the original test. This allows immediate
comparison between any achievement level estimate (θ̂) and any point on the
achievement metric (e.g., θ_m). However, two different subsets of items may
result in achievement level estimates that are not equally informative. For
example, if one test consisted of many items that were too easy for a given
individual and the other used the same number of equally discriminating items at
about the appropriate difficulty level for that individual, the second test
would yield a much more accurate achievement level estimate for that individual.
Achievement level estimates that are on the same metric are comparable if their
differential precision is taken into account. To do this, confidence interval
estimates for the θ's should be compared instead of the point estimates (θ̂).
For this reason, the AMT strategy makes mastery decisions with the use of
Bayesian confidence intervals.

After each item was selected using MISS and administered to a student, a
point estimate of the student's achievement level (θ̂) was determined using
Owen's Bayesian scoring algorithm and the responses obtained from all items
previously administered. Given this point estimate and the corresponding
variance estimate for the θ̂, also obtained using Owen's procedure (see Brown &
Weiss, 1977, Equations 3 and 5, pp. 4-5), a Bayesian confidence interval may
be defined such that:

\[
\hat{\theta}_i - 1.96\,(\hat{\sigma}^2_i)^{1/2} \;\le\; \theta \;\le\;
\hat{\theta}_i + 1.96\,(\hat{\sigma}^2_i)^{1/2}, \quad \text{with } P = .95,
\qquad [3]
\]

where

$\hat{\theta}_i$ = the Bayesian point estimate of achievement level calculated
following item i,
$\hat{\sigma}^2_i$ = the Bayesian posterior variance estimate following item i, and
$\theta$ = the true achievement level.



This statement may be interpreted as meaning that the probability that the true
value of the achievement level parameter, θ, is within the bounds of the
confidence interval is .95. Alternatively, it might also be concluded with 95%
confidence that the true parameter value (θ) lies within the confidence interval.
Confidence intervals at differing confidence levels can be constructed using
appropriate z-values from a normal distribution in place of the 1.96 in
Equation 3.
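
A small Python sketch of Equation 3 follows, with the z-value looked up from
the standard normal distribution so that other confidence levels can be used.
The function name `bayesian_interval` and the example values are hypothetical;
the point estimate and posterior variance are taken as given, since Owen's
scoring equations are not reproduced in this section.

```python
from statistics import NormalDist

def bayesian_interval(theta_hat, post_var, confidence=0.95):
    """Symmetric interval theta_hat +/- z * sqrt(post_var), as in Equation 3."""
    # Two-sided z-value: about 1.96 for a confidence level of .95.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * post_var ** 0.5
    return theta_hat - half_width, theta_hat + half_width

if __name__ == "__main__":
    for level in (0.90, 0.95, 0.99):
        lower, upper = bayesian_interval(0.40, 0.16, level)
        print(f"{level:.0%} interval: ({lower:.3f}, {upper:.3f})")
```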

After this confidence interval has been generated, it can be determined
whether or not θ_m, the achievement level earlier designated as the mastery
level using the TCC (see Figure 1), falls outside the limits of the confidence
interval. If it does not, another item is administered to the student, and the
confidence interval is recalculated using the updated θ̂ and its updated
variance. This procedure continues until, after some item has been administered,
the confidence interval calculated does not include θ_m, the mastery level on
the achievement continuum. At this point testing is terminated, and a mastery
decision is made. If the lower limit of the confidence interval falls above
the specified mastery level, θ_m, the student is declared a master. If, on the
other hand, the upper limit of the confidence interval falls below θ_m, the
student is declared a nonmaster. Given a finite size item pool, the testing
procedure may, in some cases, exhaust the item pool before a decision can be made.
This will occur for students with θ values close to θ_m. It is possible to make
a mastery decision for these students based simply on whether the Bayesian point
estimate of their achievement level (θ̂) is above or below θ_m. However, for
these students, mastery decisions will not be made with the same confidence
levels as those made for students for whom the confidence interval falls completely
above or below θ_m.
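
The termination rule described above can be written as a pair of small
functions. This Python sketch is illustrative only: the names
`mastery_decision` and `forced_decision` are hypothetical, and the treatment of
an estimate exactly equal to the mastery level (classified here as nonmaster)
is an assumption, since the text does not address ties.

```python
def mastery_decision(theta_hat, post_var, theta_m, z=1.96):
    """Return 'master', 'nonmaster', or None if testing should continue."""
    half_width = z * post_var ** 0.5
    if theta_hat - half_width > theta_m:
        return "master"        # whole interval lies above the mastery level
    if theta_hat + half_width < theta_m:
        return "nonmaster"     # whole interval lies below the mastery level
    return None                # interval still contains the mastery level

def forced_decision(theta_hat, theta_m):
    """Decision from the point estimate alone, for use when the pool is exhausted."""
    # Assumption: an estimate exactly at theta_m is treated as nonmaster.
    return "master" if theta_hat > theta_m else "nonmaster"

if __name__ == "__main__":
    print(mastery_decision(1.2, 0.09, 0.5))   # narrow interval above the cutoff
    print(mastery_decision(0.6, 0.25, 0.5))   # interval straddles the cutoff
    print(forced_decision(0.6, 0.5))          # fallback on the point estimate
```

A forced decision carries less confidence than one reached by the interval
rule, so it may be worth recording which rule produced each classification.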



Illustration

Figure 2 shows the result of the AMT procedure for two hypothetical
testees, A and B. Achievement level point estimates (θ̂) and error bands, which
indicate the appropriate Bayesian confidence intervals, are shown for each
testee after each item was administered. An arbitrary mastery level, θ_m = .50,
was chosen for this example; normally, however, the mastery level would be
determined by the TCC transformation of an existent proportion correct mastery
criterion.

For Testee A, the first θ estimate was below θ_m, but the confidence
interval around this estimate contained θ_m. Thus, the θ estimate was not precise
enough to make a confident decision; consequently, testing continued for
Testee A. After each item was administered, a new θ estimate and a corresponding
confidence interval were calculated. For the first 6 items administered to
Testee A, the confidence interval around the θ estimate contained θ_m, and
testing continued. After the administration of the 7th item, the entire confidence
interval around the θ estimate for Testee A was above θ_m. This implied that
the θ estimate was precise enough to allow a confident decision to be made for
Testee A. Testee A was declared a master at this point, and testing was
terminated.



Figure 2

Example of the AMT Procedure: Achievement Level Point Estimates and
Bayesian Confidence Intervals after Each Item Administered to
Two Hypothetical Testees, Testee A and B

[Plot not reproduced. Horizontal axis: Number of Items Administered.
Legend: achievement level estimate; Bayesian confidence interval.]

For Testee B, the same type of procedure was followed. For the first 13
items administered to Testee B, the confidence interval around θ̂ contained θ_m.
The 16th item administered to Testee B resulted in a θ̂ and confidence interval
which fell completely below θ_m. At that point, testing was terminated and
Testee B was declared a nonmaster of the subject area.

It should be noticed that Testee A had a final θ estimate (θ̂ = 1.9) that was
much closer to the mastery level than the final θ estimate for Testee B
(θ̂ = -.30). Therefore, more precise measurement was needed for Testee B
than for Testee A to make mastery decisions with comparable confidence levels,
and several more items were administered to Testee B than to Testee A, to
obtain the additional precision needed in order to make the mastery decision.



The AMT strategy was evaluated using real-data simulation (Weiss, 1973).
In this approach, test item response data obtained from the administration of
a conventional paper-and-pencil multiple-choice achievement test were used to
simulate the administration of the AMT strategy. That is, items were selected
by the AMT strategy for each student from the conventional test already
administered. Item responses obtained in the conventional test were used by the
AMT strategy and scored as described above. If a mastery decision could not be
made after a given item was used, another item from the conventional test was
selected by the MISS approach, and the previously obtained item response was
used by the AMT strategy. This procedure was continued until the AMT strategy
could make a mastery decision or until all items in the conventional test pool
had been administered.
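
The replay logic described above can be sketched as a short loop. This is
only an illustration under stated assumptions, not the program used in the
study: the function name and the three callables (item selection, scoring,
and the decision rule) are hypothetical placeholders that the caller must
supply.

```python
from typing import Callable, Dict, List, Tuple

# (item id, scored response) pairs in the order the items were used
History = List[Tuple[int, int]]


def replay_adaptive_test(
    recorded: Dict[int, int],
    select_item: Callable[[List[int], float], int],
    update_estimate: Callable[[History], float],
    can_decide: Callable[[History, float], bool],
    start_theta: float = 0.0,
) -> Tuple[List[int], float, bool]:
    """Replay one trainee's stored conventional-test responses adaptively.

    recorded maps item id -> 0/1 response already given on the conventional
    test. Returns (items used, final estimate, decided before exhaustion).
    All three callables are hypothetical stand-ins supplied by the caller.
    """
    remaining = sorted(recorded)
    history: History = []
    theta = start_theta
    while remaining:
        item = select_item(remaining, theta)
        remaining.remove(item)
        history.append((item, recorded[item]))  # reuse the stored response
        theta = update_estimate(history)
        if can_decide(history, theta):
            return [i for i, _ in history], theta, True
    # Pool exhausted without a confident decision.
    return [i for i, _ in history], theta, False
```

No new responses are ever collected in this scheme; only the order in which
the stored responses are consumed changes from trainee to trainee.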

Subjects and Tests 

Item response data were obtained from trainees undergoing the Weapon
Mechanics course at the Lowry Air Force Base Technical Training Center during
1977 and 1978. This course is computer-managed, and trainees proceed at their
own pace through 13 well-specified blocks of instruction. During each block,
several tests are given from which mastery decisions are made. Trainees are
given several attempts to pass each test in each block.

For this study two block tests of different lengths were arbitrarily chosen
to investigate the properties of the AMT procedure. Specifically, data used
were the item responses of 200 trainees to the first test in the first block of
instruction (Test 11) and the item responses of 200 trainees to the first test
in the third block of instruction (Test 31). These tests consisted of 30 and 50
conventionally administered 5-alternative multiple-choice items, respectively.
Only the trainees' performances in their first attempt to pass the tests were
used for this study.

Fitting the ICC Response Model 

Estimation of item parameters. The procedure used for the estimation of
the three item parameters of the logistic ICC response model was developed by
Urry (1976). This procedure obtains initial estimates for the discrimination
(a) and the difficulty (b) parameters for an item through the use of a direct
conversion of the classical item parameters and the individuals' raw scores
(number correct). A value of the lower asymptote parameter (c) is found which
minimizes a χ² goodness-of-fit statistic for the item. These initial values
are made more precise through the use of an ancillary correction procedure
(Fisher, 1950). To obtain more precise estimates of the parameters, the entire
procedure is repeated, replacing the individuals' raw scores with Bayesian modal
estimates (Samejima, 1969) of their achievement levels.

Urry's item parameterization method excludes items which meet any of the
following rejection criteria during the first stage of the procedure:

1. a less than .80,

2. b less than -4.00 or greater than 4.00, and

3. c greater than .30.

If an item is excluded on the basis of one of these criteria during the initial
stage of the parameterization procedure, it receives no parameter estimates in
either stage of the procedure. These restrictive criteria are removed after
the first phase of the calibration, and no further culling of the items is done.
Thus, the final values of the parameter estimates for those items which survive
the first phase are not constrained by the rejection criteria.
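
The three first-stage rejection criteria can be written as a single
predicate. This is a minimal sketch, assuming the initial estimates are
available as (a, b, c) triples; the class and function names are
hypothetical and are not part of Urry's program.

```python
from typing import NamedTuple


class ItemEstimate(NamedTuple):
    """First-stage estimates for one item (hypothetical container)."""
    a: float  # discrimination
    b: float  # difficulty
    c: float  # lower asymptote


def rejected_in_first_stage(est: ItemEstimate) -> bool:
    """True if the item meets any of the three rejection criteria."""
    return (
        est.a < 0.80                      # criterion 1
        or est.b < -4.00 or est.b > 4.00  # criterion 2
        or est.c > 0.30                   # criterion 3
    )
```

Because the criteria apply only to first-stage values, a surviving item may
still end with a final a below .80 or a final c above .30, which is
consistent with the ranges of final estimates reported for both tests.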






Evaluating the fit of the model. To examine the usefulness and
appropriateness of the unidimensional three-parameter logistic ICC model with
data of the type provided by the Weapon Mechanics course, two questions were
investigated:

1. Does factor analysis of the intercorrelations between item responses
result in only a single common factor? That is, is the use of a
unidimensional model justified by the presence of only a single nonrandom
dimension?

2. Do parameter estimates obtained from these data correspond to the
range of parameter estimates obtained in previous studies that have
shown this type of model to be useful in increasing testing efficiency?

To answer the first question, principal axis factor analyses were performed
separately on the data from Test 11 and Test 31. Matrices of item
intercorrelations (phi coefficients) were calculated from the raw item-response
data for the 200 trainees on each of the tests using the PEARSON CORR computer
subroutine from the Statistical Package for the Social Sciences (SPSS; Nie,
Hull, Jenkins, Steinbrenner, & Bent, 1970).

The resultant 30 x 30 (Test 11) and 50 x 50 (Test 31) item intercorrelation
matrices were each factor analyzed by the iterative principal axis factor
analysis subroutine from SPSS. The initial communality estimate for each of the
items was the squared multiple correlation of the item with all other items in
the test. The analysis iterated until successive communality estimates differed
by a negligible amount.

To determine the amount of random variation in the final factor-analytic
solutions, parallel analyses were conducted following the suggestion of Horn
(1965). This entailed factor analyses of sets of random data that were
generated to parallel the original data, using the same number of "items" and
"subjects." Eigenvalues obtained for factors in the random data were used to
determine whether factors obtained from the analysis of the real data were
"true" factors or residual factors. If the eigenvalue of a factor obtained from
the real data was larger than that for the corresponding random-data factor, the
real-data factor was considered to be a true factor; but if the eigenvalue was
similar to that obtained from the random-data factor, then the real-data factor
was considered to be a residual factor of no real importance.
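
The comparison step of the parallel analysis can be sketched as follows.
The report gives no numerical definition of "similar," so the margin
parameter below is an assumption introduced for illustration, as are the
function name and the eigenvalues in the usage lines, which are made-up
numbers rather than values from this study.

```python
from typing import List


def count_true_factors(real: List[float], random_: List[float],
                       margin: float) -> int:
    """Count leading factors whose real-data eigenvalue exceeds the
    corresponding random-data eigenvalue by more than `margin`.

    `margin` is an assumed stand-in for the report's informal judgment of
    whether two eigenvalues are "similar". Counting stops at the first
    factor that fails the comparison.
    """
    count = 0
    for real_value, random_value in zip(real, random_):
        if real_value - random_value > margin:
            count += 1
        else:
            break
    return count


# Hypothetical eigenvalues, for illustration only.
real_eigs = [6.0, 1.4, 1.2, 0.9]
random_eigs = [1.3, 1.2, 1.1, 1.0]
n_true = count_true_factors(real_eigs, random_eigs, margin=0.5)
```

With these made-up numbers only the first factor clears the margin; a
smaller margin would retain more factors, which is why the choice has to be
stated explicitly.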

To answer the second question posed above, the parameter estimates
obtained for these two tests were compared to the estimates obtained in two
other studies (Bejar, Weiss, & Kingsbury, 1977; Brown & Weiss, 1977) that used a
unidimensional three-parameter logistic ICC model to attempt to improve testing
accuracy in achievement testing situations. Further comparisons were made
between the parameter estimates obtained from the present data and the
guidelines expressed by Urry (1977) to indicate whether the use of an adaptive
testing item pool will improve the quality or efficiency of trait measurement.
Urry's guidelines are as follows:

1. The a parameter estimates of the items in the pool should exceed .80.

2. The b parameter estimates should be widely and evenly distributed
between -2.00 and +2.00.

3. The c parameter estimates should be less than .30.

To the extent that parameter estimates obtained from Tests 11 and 31 followed
Urry's guidelines and showed close correspondence to other item pools that have
proven to be useful in adaptive testing, it could be concluded that the items
used in this study would show some usefulness with the unidimensional
three-parameter ICC model.
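
One rough way to compare an item pool with Urry's three guidelines is to
compute the share of items satisfying each numerical bound. The sketch
below assumes the final estimates are stored as (a, b, c) triples, and the
function name is hypothetical. It checks only whether b falls inside the
stated interval and does not capture the "widely and evenly distributed"
part of the second guideline.

```python
from typing import List, Tuple


def pool_summary(items: List[Tuple[float, float, float]]
                 ) -> Tuple[float, float, float]:
    """Return the proportions of items with a > .80, with b between
    -2.00 and +2.00, and with c < .30, in that order."""
    n = len(items)
    share_a = sum(1 for a, _, _ in items if a > 0.80) / n
    share_b = sum(1 for _, b, _ in items if -2.00 <= b <= 2.00) / n
    share_c = sum(1 for _, _, c in items if c < 0.30) / n
    return share_a, share_b, share_c
```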

Simulation of AMT

In order to simulate the AMT strategy, a computer program was designed to
administer the one item in the item pool (which included all of the items
from the conventional test not rejected by the calibration procedure) providing
the most information at a trainee's current level of θ̂. Each trainee began the
test with a θ̂ of 0.0 and a prior variance of 1.0. The trainee's response, taken
from his/her original responses to the conventional test, was used by the
Bayesian scoring routine to produce a new θ estimate. Then the item with the
most information at this new θ̂ was chosen to be administered next. (No item was
administered more than once to a trainee.) A new θ estimate was found using the
trainee's response to this item, and then another item was chosen based on the
new θ estimate.
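
The item selection step can be illustrated with the standard item
information function for the three-parameter logistic model. This is a
sketch under assumptions: the scaling constant D = 1.7 is the conventional
choice and is not stated in this section, the function names are
hypothetical, and the two items in the usage lines are invented.

```python
import math
from typing import Dict, Set, Tuple

D = 1.7  # conventional logistic scaling constant (assumed here)


def p_correct(theta: float, a: float, b: float, c: float) -> float:
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def item_information(theta: float, a: float, b: float, c: float) -> float:
    """Item information for the three-parameter logistic model."""
    p = p_correct(theta, a, b, c)
    return (D * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def most_informative(theta: float,
                     pool: Dict[int, Tuple[float, float, float]],
                     used: Set[int]) -> int:
    """Id of the unused item with maximum information at theta."""
    candidates = [item for item in pool if item not in used]
    return max(candidates, key=lambda item: item_information(theta, *pool[item]))


# Hypothetical two-item pool: item id -> (a, b, c).
pool = {1: (1.0, 0.0, 0.2), 2: (1.0, 2.0, 0.2)}
first = most_informative(0.0, pool, used=set())
```

Keeping a set of used item ids enforces the rule that no item is
administered more than once to a trainee.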

The program continued to choose items to be administered until the trainee's
θ̂ was shown to be either above or below a given mastery level, θ_m, with a
prespecified degree of confidence. A 95% Bayesian symmetric confidence interval
was calculated around the trainee's θ̂ after each item was administered. The
AMT strategy continued until this confidence interval failed to include the
prespecified mastery level; when this occurred, the AMT procedure was terminated.
A lower limit of three items was set for the length of the AMT to avoid anomalous
results that might occur from making mastery decisions based on a small number
of item responses. For trainees for whom a mastery decision could not be made
with the AMT procedure before all items were administered, mastery was
determined by whether the final θ̂ was above or below θ_m.
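
The termination rule can be sketched as a small function. Two details are
assumptions made for illustration: the symmetric 95% interval is taken to be
the estimate plus or minus 1.96 posterior standard deviations, and a final
estimate exactly equal to the mastery level is treated as nonmastery, a case
the text does not address. The function name is hypothetical.

```python
import math
from typing import Optional


def mastery_decision(theta_hat: float, posterior_var: float, theta_m: float,
                     n_items: int, pool_size: int,
                     min_items: int = 3, z: float = 1.96) -> Optional[str]:
    """Return "master", "nonmaster", or None if testing should continue.

    Assumes the 95% Bayesian interval is theta_hat +/- z * posterior SD.
    """
    half_width = z * math.sqrt(posterior_var)
    lower, upper = theta_hat - half_width, theta_hat + half_width
    if n_items >= min_items:
        if lower > theta_m:   # interval lies entirely above the mastery level
            return "master"
        if upper < theta_m:   # interval lies entirely below the mastery level
            return "nonmaster"
    if n_items >= pool_size:  # pool exhausted: fall back to the point estimate
        return "master" if theta_hat > theta_m else "nonmaster"
    return None
```

Decisions returned before the pool is exhausted correspond to the cases in
which the interval excluded the mastery level; the fallback branch covers
trainees for whom no such decision was reached.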

During the simulation, three different mastery levels were used correspond- 
ing to proportion correct mastery levels of P=.7, .8, and .9. These mastery 
levels were calculated from the TCC for each test, as described above. To max- 
imize the comparability between the conventional and adaptive mastery testing 
strategies, the conventional test was truncated to include only the items which 
were not rejected by the calibration procedure. In addition, the conventional 
test was scored by Owen's Bayesian scoring method, and the same mastery levels 
were used for both testing strategies. 
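
Converting a proportion-correct mastery level to the achievement metric
amounts to inverting the test characteristic curve. The sketch below assumes
the TCC is the mean of the item response probabilities under the
three-parameter logistic model with D = 1.7; the report's own equation is
not reproduced in this section. It uses bisection, which works because the
TCC increases with θ. Function names and the search bounds are hypothetical.

```python
import math
from typing import List, Tuple

D = 1.7  # conventional logistic scaling constant (assumed here)
Item = Tuple[float, float, float]  # (a, b, c)


def icc(theta: float, a: float, b: float, c: float) -> float:
    """Three-parameter logistic item characteristic curve."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def tcc(theta: float, items: List[Item]) -> float:
    """Expected proportion correct at theta (assumed form of the TCC)."""
    return sum(icc(theta, *item) for item in items) / len(items)


def theta_for_proportion(p_target: float, items: List[Item],
                         lo: float = -6.0, hi: float = 6.0,
                         tol: float = 1e-6) -> float:
    """Find theta whose expected proportion correct equals p_target."""
    if not tcc(lo, items) < p_target < tcc(hi, items):
        raise ValueError("p_target is outside the range of the TCC")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if tcc(mid, items) < p_target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

A target proportion at or below the mean lower asymptote has no
corresponding θ, which is why the function raises an error instead of
returning a boundary value.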

Comparison of Efficiency: AMT versus Conventional Testing

If the AMT strategy were a more efficient testing procedure than the
conventional mastery testing procedure, it would reduce test length while
administering items with high enough information to maintain a very high
correlation between decisions made by the AMT and the conventional approach.
Consequently, to determine whether the AMT procedure reduced the number of items
given to trainees without reducing the quality of the mastery decisions made for
those trainees, three criteria were evaluated separately for Test 11 and Test 31
for the AMT and conventional testing procedures at each of the mastery levels:

1. The mean number of items administered to trainees,

2. The mean information obtained after all items were administered, and

3. Relationships between mastery decisions made at the termination of
the testing by the AMT and conventional procedures.






Figure 3

Eigenvalues of the First 10 Common Factors Extracted From Item
Intercorrelations for Test 11 and Test 31 and for Parallel Random-Data Factors

[Figure: two panels, (a) Test 11 and (b) Test 31, each plotting eigenvalue
(vertical axis) against factor number, 1 to 10 (horizontal axis), for the test
data and for the random data.]






In addition, to examine the characteristics of the two testing procedures more 
closely, the mean information obtained from each procedure was plotted for each 
testing strategy as a function of the achievement level estimate for each mas- 
tery level. 

RESULTS

Applicability of the ICC Model

Factor analysis. Eigenvalues of the first 10 factors extracted from
item intercorrelations for Test 11 and Test 31 and the random data parallel
analysis for each test are shown in Appendix Table B-1; these values are plotted
in Figure 3. For Test 11 (Figure 3a) the first three factors had higher
eigenvalues than their corresponding random-data factors. However, only the
first factor differed substantially from the corresponding random-data factor.
Thus, for Test 11 it was not unreasonable to infer that only the first factor
was a "true" factor underlying trainees' responses, since the eigenvalues of the
other factors resembled those of the random factors and the first factor
accounted for more than three times the amount of common variance than any other
factor.

For Test 31 (Figure 3b) the eigenvalues of the first five factors extracted 
each exceeded the eigenvalues of their corresponding random factor, but only 
the first two factors exceeded the random-data values by a substantial amount. 
The first factor accounted for 20.5% of the common variance extracted by the
10-factor solution, and the second factor accounted for 6.2% of the common 
variance. No other factor accounted for more than 5% of the variance. These 
data indicate that there were probably two real factors underlying trainees' 
responses to Test 31. This two-factor solution might indicate that a multi- 
dimensional latent trait model should be postulated to explain trainees' re- 
sponses to Test 31. However, because the first factor accounted for over three 
times as much variance as the second factor, the unidimensional model could 
still be used; data presented by Reckase (1978) indicate that if a dominant 
first factor exists, items calibrated using a unidimensional model will adequate- 
ly measure that first factor. 

Estimation of the ICC parameters. Tables 1 and 2 show the ICC parameter 
estimates obtained for each of the items in Test 11 and Test 31, respectively. 
Of the items in the conventional test, 17% (5 items) from Test 11 were rejected 
by the parameterization procedure, while 24% (12 items) were rejected for Test 31.
These losses are comparable to losses observed during other investigations of 
achievement tests using this parameterization procedure; Bejar, Weiss, and 
Kingsbury (1977) lost 22% of their total pool during item parameterization, and 
Brown and Weiss (1977) lost 13% of their total pool. 

For Test 11, values of the a parameter estimates ranged from .63 to 4.69,
with a mean of 1.48 and a standard deviation of .98. Values of the b parameter
estimates ranged from -2.35 to 1.32, with a mean of -.98 and a standard
deviation of 1.01. Values of the c parameter estimates ranged from .00 to .49,
with a mean of .27 and a standard deviation of .138.

For Test 31, values of estimates of the a parameter ranged from .63 to
3.42, with a mean of 1.16 and a standard deviation of .65. Values of the b
parameter estimates were from -1.86 to 3.18, with a mean of -.58 and a standard
deviation of 1.08. The c parameter estimates ranged from .00 to .77, with a
mean of .28 and a standard deviation of .16. For both of these tests the
parameter estimates obtained were well within the range established by two
earlier studies that examined achievement tests using the same item
parameterization method (Bejar, Weiss, & Kingsbury, 1977; Brown & Weiss, 1977).

Table 1

ICC Item Parameter Estimates for the Items in Test 11

                      a                 b                 c
Item Number     Discrimination     Difficulty      Lower Asymptote

[Entries for Items 1 through 20 are not legible in the source copy.]
    21               4.69              .98               .00
    22               2.16            -1.51               .16
    23               ?.16            -1.55               .19
    24               1.32              .56               .30
    25               1.21            -1.54               .36
    26
    27               3.58            -2.35           [illegible]
    28               1.04             1.32               .46
    29                .83            -1.69               .09
    30               1.31             -.46               .43

Missing values indicate that the item was rejected by the parameter
estimation procedure.

Examination of the item parameter estimates obtained from Test 11 and
Test 31, using Urry's guidelines for a good adaptive testing item pool, indicated
the following:

1. For both Test 11 and Test 31, 76% of the items had a values exceeding
.80, while the average value for both tests exceeded 1.00.

2. The b values were fairly widely and evenly distributed between -2.0
and 1.0, but the distribution was rather sparse above 1.0. Considering
the small numbers of items in the two item pools, the distribution
of the b values seems appropriate, though the pools might have been
slightly too easy to meet Urry's second guideline. However, Urry's
guidelines were proposed for ability tests for which it is desired to
measure precisely across a wide range of ability, whereas the data of
this study were from a mastery achievement test for which it was
desired to classify students on either side of a mastery level. Thus,
the distribution of b values would not be expected to conform with
Urry's second recommendation.

3. Fifty-six percent of the items in Test 11 and 47% of the items in
Test 31 obtained c estimates below .30. The average c estimate for
each test was less than .30.

Thus, in light of Urry's guidelines and the earlier studies, examination of the
item parameters obtained indicated that the parameter estimates obtained from
Test 11 and Test 31 were similar to those obtained for items which had
previously been used to improve achievement measurement; consequently, the items
were appropriate for investigating the AMT strategy.

Table 2

ICC Item Parameter Estimates for the Items in Test 31

[The row alignment of this table did not survive in the source copy. The
legible entries are listed below by column, in their original top-to-bottom
order. Of the 50 items, 38 received estimates.]

Discrimination (a):
  .70, 3.39, 1.95, .88, .65, .71, .81, .66, 1.18, .95, 2.55, .94, 1.13,
  .92, 1.03, .79, .80, 1.01, .80, .79, 1.05, .95, 1.11, 1.54, .73, .63,
  .95, 1.20, 1.07, 3.42, 1.04, 1.18, 1.03, .74, 1.08, 1.02, .83, 1.70

Difficulty (b) (signs and some leading digits of the first eight entries
are not legible):
  1.40, 1.86, 18, 78, .82, .68, 1.85, .84, -.74, -.90, -1.39, -.44, -1.43,
  -.46, -.49, .26, -1.04, -.65, -1.11, .98, .09, -.23, -1.64, -1.56, -.44,
  -1.54, .40, 1.13, .45, -1.74, -.77, -.49, -.97, -1.83, -.56, .80, .29,
  1.06

Lower Asymptote (c) (36 of the 38 entries are legible):
  .33, .77, .37, .14, .39, .38, .35, .37, .36, .01, .13, .23, .38, .13,
  .16, .35, .15, .19, .27, .41, .39, .20, .14, .11, .06, .17, .45, .27,
  .37, .39, .36, .21, .38, .37, .42, .33

Missing values indicate that the item was rejected by the parameter
estimation procedure.

Conversion of the Mastery Level to the ICC Metric

The ICC item parameter estimates for each test were used in Equation 1 to
obtain the TCC for each test. Figure 4 shows the resulting TCC for Test 11
(Figure 4a), using item parameters for the 25 items that survived the
calibration procedure, and for Test 31 (Figure 4b), based on the 38 items for
which parameter estimates were available on that test. Conversion of the
proportion correct mastery levels (P=.7, .8, and .9) to the achievement metric
(θ) is also shown.

Test 11 had a slightly steeper TCC than did Test 31, reflecting the higher
average discrimination of its items. The lower average b level of the Test 11
items (i.e., easier items) is reflected in the fact that the TCC for Test 11 is
shifted to the left along the achievement level, or θ, axis in comparison to
Test 31. The relatively equal average c parameters for the two tests are
reflected in the values of the TCC at θ=-4.0.

For Test 11 the P=.7 mastery level was converted to θ=-.90 on the
achievement metric, the P=.8 mastery level was converted to θ=-.23, and the P=.9
mastery level was converted to θ=.75. For Test 31 the P=.7 mastery level was
converted to θ=-.48; the P=.8 level, to θ=.12; and the P=.9 level, to θ=.91 on
the achievement metric. It can be seen that for both tests the conversion was
nonlinear, reflecting the gain in potential discriminability resulting from
consideration of the unique operating characteristics of each item.

Test Length 

Table 3 shows the mean number of items, the average amount of information
obtained from each item administered, and the number of individuals from
various subsamples under the AMT and conventional strategies at each of the
three different mastery levels. The four subgroups for which these data are
presented are (1) the total group of trainees, (2) the groups of trainees
declared masters by the relevant testing procedure, (3) the groups of trainees
declared nonmasters by the relevant testing procedure, and (4) the groups of
trainees for which the AMT procedure made decisions with full confidence (i.e.,
trainees for whom the mastery level, θ_m, fell outside the 95% confidence
interval at some point during the test and terminated the AMT procedure).
Frequency distributions of numbers of items administered for each of these
subgroups are in Appendix Table B-2 for Test 11 and Appendix Table B-3 for
Test 31.

Figure 4

Test Characteristic Curves for Test 11 and Test 31, with Conversion
of Three Mastery Levels (P=.7, .8, and .9) from the Proportion-
Correct Metric to the Achievement Metric

[Figure: two panels, (a) Test 11 and (b) Test 31, each plotting Expected
Proportion Correct (vertical axis) against Estimated Achievement Level (θ)
(horizontal axis).]



Table 3

Sample Size (N), Mean Test Length (L), and Mean Information Per Item (I)
for AMT and Conventional (Conv) Tests for Tests 11 and 31 at
Three Mastery Levels for Total Group and Three Subgroups

Test, Mastery                                Group
Level, and
Testing            Total           Mastery         Nonmastery    High Confidence
Strategy         N    L    I      N    L    I      N    L    I      N    L    I

Test 11
  P=.7
    Conv       199  25    .29   172  25    .28    27  25    .39   154  25    .30
    AMT        199  12.8  .32   174  12.3  .28    25  16.4  .52   154   9.2  .36
  P=.8
    Conv       199  25    .29   135  25    .29    64  25    .30   100  25    .40
    AMT        199  17.4  .29   126  17.4  .28    73  17.4  .32   100   9.9  .55
  P=.9
    Conv       199  25    .29    43  25    .50   156  25    .23   132  25    .27
    AMT        199  13.1  .34    34  20.8  .50   165  11.5  .27   132   7.0  .38

Test 31
  P=.7
    Conv       200  38    .22   127  38    .18    73  38    .28   122  38    .21
    AMT        200  21.8  .30   134  19.4  .26    68  26.4  .36   122  11.4  .44
  P=.8
    Conv       200  38    .22    73  38    .15   127  38    .26   117  38    .24
    AMT        200  23.4  .26    74  27.7  .20   126  20.9  .32   117  13.1  .41
  P=.9
    Conv       200  38    .22    27  38    .12   173  38    .23   151  38    .24
    AMT        200  14.7  .23    28  38.0  .12   172  10.9  .30   151   7.2  .40



Total group. For the total group of trainees responding to Test 11, the
AMT procedure reduced the average number of items administered (L)
substantially at every mastery level. The minimum reduction in number of items
administered that was noted was for the P=.8 mastery level, where test length
for the conventional test was 25 items, compared to a mean test length for the
AMT procedure of L=17.4 items; this reduction of 7.6 items represents a minimum
test length reduction of 30.4% of the conventional test length. The maximum
test length reduction was 48.8% of the conventional test (12.2 items) when a
mastery level of P=.7 was used. For the same group of trainees, a gain in the
average amount of information (I) obtained from each item administered was
noted for the AMT procedure at the P=.7 and P=.9 mastery levels. The gains in
information per item administered were .03 information units (IU), or a 10%
increase, at the P=.7 mastery level, and .05 IU, or a 17% increase, at the
P=.9 mastery level.
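
The percentages in this paragraph follow from simple ratios, which the
sketch below reproduces from the figures given for Test 11 (a conventional
length of 25 items, a mean AMT length of 17.4 items at P=.8, and mean
information per item of .29 and .32 at P=.7). The function names are
hypothetical.

```python
from typing import Tuple


def length_reduction(conventional_length: float,
                     mean_amt_length: float) -> Tuple[float, float]:
    """Items saved and the saving as a percent of the conventional length."""
    saved = conventional_length - mean_amt_length
    return saved, 100.0 * saved / conventional_length


def relative_gain(amt_info: float, conv_info: float) -> float:
    """Percent increase in mean information per item, AMT over conventional."""
    return 100.0 * (amt_info - conv_info) / conv_info


saved, pct = length_reduction(25, 17.4)
print(f"{saved:.1f} items, {pct:.1f}%")       # prints "7.6 items, 30.4%"
print(f"{relative_gain(0.32, 0.29):.0f}%")    # prints "10%"
```

The same two ratios apply to every subgroup comparison reported in this
section; only the conventional test length (25 or 38 items) and the group
means change.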

For the total group of trainees responding to Test 31, the same two trends
were noted. First, test length was reduced with the use of the AMT procedure
at each mastery level. The minimum reduction of test length was noted with the
use of the P=.8 mastery level, for which the conventional test length of 38 items
was reduced to a mean AMT length of L=23.4 items, a reduction of 38.4% in mean
test length. The greatest reduction in test length was noted for the P=.9
mastery level, at which the mean AMT length was 14.7, a reduction in test length
of 61.3%.

The second trend was that the AMT procedure provided more information with
each item administered than the conventional test for all mastery levels. The
smallest increase in information was .01 IU per item (a 5% increase), for the
P=.9 mastery level. The largest gain in the mean information per item was .08
IU (a 36% increase), for the P=.7 mastery level. For mastery levels P=.8 and
P=.9, the percent reduction in test length under AMT was greater for Test 31
than that noted for Test 11. The increase in information per item noted for
AMT was greater for Test 31 than for Test 11 at all three mastery levels.

Appendix Tables B-2 and B-3 show that test lengths for the AMT procedure
for different trainees were quite variable. For most of the trainees, either
a very long test (as long as the conventional test) was needed, or a very short
test (8 items or less) was sufficient. This U-shaped distribution of test
lengths was obtained for both Test 11 and Test 31 across all mastery levels.

Mastery groups. When only those trainees were considered who were judged
to be masters for Test 11 at one of the mastery levels by the AMT or the
conventional testing procedure, test length reduction was again noted for the
AMT procedure at all three mastery levels. For mastery levels P=.7 and P=.8,
adaptive tests for those in the mastery group were approximately the same mean
length as those for the total group; but for mastery level P=.9 adaptive tests
for the mastery group were much longer (20.8 versus 13.1 items on the average).
In comparison with the conventional test, for the AMT procedure in the mastery
group alone the minimum test length reduction was 4.2 items, or 16.8% of the
conventional test length of 25 items, at the P=.9 mastery level; and the
maximum test length reduction was 12.7 items, or 50.8% of the conventional test
length, at the P=.7 mastery level.

The AMT procedure and the conventional testing procedure provided almost
identical mean amounts of information (I) for items administered to the mastery
groups, even though the AMT procedure administered fewer items at each mastery
level. However, for these groups interpretation of the differences in mean
information (I) is obscured by the fact that the two different testing
procedures gave trainees with different achievement levels mastery status. A
clearer comparison of information provided by the two testing procedures is
shown below.

For the groups of trainees labeled as masters for Test 31, test-length re- 
duction was observed with the use of AMT for only two of the three mastery 
levels examined. At the P=.7 mastery level, mean test length was reduced by
18.6 items, or a reduction of 48.9% of the conventional test length, by use of 
AMT. For the P=.8 mastery level the mean test length was reduced by 10.3 items, 
or a reduction of 27.1% of the conventional test length. For the P=.9 mastery 
level the AMT procedure never reached a decision of mastery in less than 38 
items, the length of the conventional test. 

For Test 31, the AMT procedure resulted in higher mean information per
item than the conventional test for the P=.7 mastery level (a difference of
.08 IU per item, or a 44% increase over the conventional test) and the P=.8
mastery level (.05 IU per item higher, a 33% increase). At the P=.9 mastery
level the conventional test and the adaptive test administered items with equal
average information.

As the mastery level became higher, for both Test 11 and Test 31 there was
a trend for greater numbers of items to be administered before a decision of
mastery could be made. This resulted from the fact that the higher mastery
levels fell above the steepest portion of the TCCs, as is shown in Figure 4.
This would imply that the entire conventional test would have more difficulty
discriminating among trainees at these mastery levels; consequently, the AMT
procedure would have to use more of the items from the conventional test in
order to determine whether a trainee was above or below the higher mastery
levels. This trend may be clearly seen in Appendix Tables B-2 and B-3. For each
test, trainees were placed in the mastery group for mastery level P=.7 with a
wide range of test lengths. As the mastery level was raised, trainees were more
likely to be declared masters only after a larger number of items were
administered, until for Test 31 at the P=.9 mastery level, all those who were
declared masters took all of the items in the item pool before the mastery
decision was made.

Nonmastery groups. For the trainees who were declared nonmasters for Test 11
using either the adaptive or conventional testing procedures, reductions in
test length were observed at every mastery level with the AMT procedure. The
smallest reduction in test length, 7.6 items, was observed for the P=.8 mastery
level and accounted for 30.4% of the conventional test length. The largest
reduction in test length was 13.5 items at the P=.9 mastery level, or 54% of the
conventional test length. At each mastery level for Test 11, more mean
information was obtained from each item administered to the nonmasters by the
AMT procedure than by the conventional procedure. The smallest increase in
information per item was .02 IU (a 6.7% increase), for the P=.8 mastery level.
The largest increase in mean information was .13 IU (a 33.3% increase) per item,
for the P=.7 mastery level.

For the trainees declared nonmasters for Test 31, reductions in mean test
length were again noted with the AMT procedure at each mastery level. The
minimum mean decrease in test length was 11.6 items, or 30.5% of the
conventional test length of 38 items, at the P=.7 mastery level. The maximum
reduction in average test length was 27.1 items, or 71.3% of the conventional
test length, at the P=.9 mastery level. As the criterion level increased, the
number of items needed by the AMT procedure to make the nonmastery decision
steadily decreased.

For the nonmastery groups administered Test 31, the mean information per
item was higher at each mastery level for the AMT procedure than for the
conventional testing procedure. The minimum increase in information was .06 IU
(a 23% increase) per item administered, for the P=.8 mastery level; and the
maximum increase observed was .08 IU per item (a 28.6% increase), for the P=.7
mastery level.

Across both Tests 11 and 31, there was a tendency for the adaptive test
to administer fewer items before making a decision of nonmastery as the mastery
level increased. The sole exception to this trend was observed for Test 11 at
the P=.8 mastery level, which showed a slight increase in the number of items
administered when compared with the P=.7 mastery level for that test. For both
tests a higher mean information was obtained for each item administered by the
AMT procedure at each mastery level. No consistent trend was noted in the
differences in average information per item across mastery levels for the two
tests.

High-confidence groups. The high-confidence groups included only those
trainees for whom the AMT procedure terminated with full confidence, i.e.,
trainees for whom the Bayesian confidence interval failed to include the
mastery level at some test length at or before the exhaustion of the items from
the conventional test item pool. For Test 11 the AMT procedure terminated with
high confidence for a minimum of 50% of the group of trainees, at the P=.8
mastery level. The largest high-confidence group was 77% (N=154) of the total
group of trainees, at the P=.7 mastery level.

Test length was reduced considerably by the AMT procedure at all criterion
levels for the high-confidence groups. The minimum reduction in mean test length
was observed for the P=.8 mastery level and was 15.1 items, or 60.4% of the con-
ventional test length. The largest mean reduction in test length observed was
18 items, or 72% of the conventional test length, at the P=.9 mastery level.
Modal test length for the high-confidence groups for Test 11 at all mastery levels
was 3 items (see Appendix Table B-2), or only 12% of the length of the conven-
tional test (an 88% reduction). The AMT procedure produced greater mean infor-
mation per item at each mastery level. The smallest observed increase was .06
IU (a 20% increase) per item administered, for the P=.7 mastery level. The
largest mean increase was .15 IU per item (a 37.5% increase), at the P=.8 level.
For Test 31 the minimum number of trainees in the high-confidence group was 117,
or 58% of the total group, at the P=.8 mastery level. The largest high-confi-
dence group was 151, or 76% of the total trainee group, for the mastery level
P=.9.



Test length for the AMT procedure was much shorter than the conventional
test at each criterion level. The smallest reduction in mean test length was
24.9 items, or 65.5% of the conventional test length, for the P=.8 mastery lev-
el. The largest average reduction in test length was 30.8 items, or 81.1% of
the total conventional test length, for the mastery level. Similar to
Test 11, modal test lengths for Test 31 were quite short: 4 items for the P=.7
mastery level, 5 items for the P=.8 mastery level, and 3 items (for 57% of the
high-confidence group) at the P=.9 mastery level.

The AMT procedure produced higher mean information per item than the con-
ventional testing procedure at all mastery levels. The minimum increase in
mean information per item was .16 IU (an increase of 66.7% over the mean infor-
mation provided by the conventional test), for the P=.9 mastery level. The
maximum mean information increase that was observed was .23 IU per item (a 112%
increase), for the P=.7 mastery level.

For both Test 11 and Test 31 the AMT procedure made confident decisions for
between 50% and 77% of the total group at each mastery level. For the trainees
in the high-confidence groups, the average adaptive test length ranged from 19%
to 39% of the original conventional test length, while modal test lengths were
only 8% to 6% of the conventional test length (i.e., over 90% reduction). Also,
the adaptive testing procedure resulted in a 20% to 119.5% increase in the mean
amount of information obtained per item over the conventional test. The in-
crease in mean information per item was greater for Test 31 than for Test 11
at all criterion levels.
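
The percentages quoted for the high-confidence groups follow directly from the mean number of items saved and the conventional test lengths (25 items for Test 11, 38 items for Test 31). The short sketch below only re-derives those figures; the function name is an illustrative choice and not part of the report.

```python
def percent_of_conventional(items_saved, conventional_length):
    """Express a mean reduction in items as a percentage of the
    conventional test length, rounded to one decimal place."""
    return round(100.0 * items_saved / conventional_length, 1)


# Conventional test lengths stated in the report.
TEST_11_LENGTH = 25
TEST_31_LENGTH = 38

# Mean reductions reported for the high-confidence groups.
test_11_min = percent_of_conventional(15.1, TEST_11_LENGTH)
test_11_max = percent_of_conventional(18.0, TEST_11_LENGTH)
test_31_min = percent_of_conventional(24.9, TEST_31_LENGTH)
test_31_max = percent_of_conventional(30.8, TEST_31_LENGTH)

print(test_11_min, test_11_max, test_31_min, test_31_max)
```

The complements of the largest and smallest reductions (about 19% and 39% of the conventional length) bound the range of average adaptive test lengths quoted above.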






Correspondence between Decisions

Table 4 shows the Pearson product-moment (phi) correlations between the
decisions made by the AMT and conventional testing procedures across all three
criterion levels for Test 11 and Test 31. The lowest correlation observed was
.67, for Test 11 at the P=.9 mastery level. The highest correlation was .97,
for Test 31 at the P=.8 mastery level. The correlations between mastery de-
cisions for Test 31 were higher than for Test 11 at all mastery levels. In ad-
dition, the average decision variance in common between the two testing proce-
dures was 79% of the total decision variance.

Table 4

Phi Correlations Between Mastery
Decisions Made by AMT and
Conventional Testing Procedures for
Test 11 and Test 31, at Three
Mastery Levels

                 Mastery Level
Test          P=.7    P=.8    P=.9

Test 11       .91     .88     .67
Test 31       .93     .97     .94
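
The 79% figure for shared decision variance can be recovered, under the assumption that it is the mean of the squared phi coefficients in Table 4, with a few lines of arithmetic. The variable names below are illustrative only.

```python
# Phi correlations from Table 4, ordered P=.7, P=.8, P=.9.
phi = {
    "Test 11": [0.91, 0.88, 0.67],
    "Test 31": [0.93, 0.97, 0.94],
}

# A squared correlation is the proportion of variance two variables share.
squared = [r * r for values in phi.values() for r in values]

# Average shared variance over the six test-by-level combinations.
mean_shared_variance = sum(squared) / len(squared)

print(round(mean_shared_variance, 2))
```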



To examine more completely the correspondence in decisions made by the AMT
and conventional procedures, Table 5 shows joint frequency distributions of de-
cisions for the two testing procedures at each of the three mastery levels for
Test 11 and Test 31. The lowest level of agreement between the AMT and conven-
tional testing procedures was noted for Test 11 at the P=.9 mastery level, where
the two testing procedures agreed for 178, or 89.4% of the 199 trainees tested.
The highest level of agreement was 98.5%, for Test 31 at the P=.8 and P=.9 mas-
tery levels. Across both tests and all criterion levels, the two procedures
agreed for 95.9% of the trainees tested. For the longer test (Test 31) the two
procedures agreed for 97.9% of the trainees, and for the shorter test (Test 11)
the two procedures agreed for 94.0% of the trainees.

Table 5

Joint Distributions of Mastery Decisions Made by AMT and
Conventional Tests 11 and 31 at Three Mastery Levels

Mastery Level               Test 11                  Test 31
and AMT Decision      Mastery   Nonmastery     Mastery   Nonmastery

P=.7
  AMT Mastery           171          3           126          6
  AMT Nonmastery          1         24             1         67
P=.8
  AMT Mastery           126          1            72          2
  AMT Nonmastery         10         62             1        125
P=.9
  AMT Mastery            28          6            26          2
  AMT Nonmastery         15        150             1        171
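
As a check on how the agreement percentages and phi correlations relate to a joint frequency table, the sketch below computes both from one 2 x 2 table, using the Test 11 frequencies at the P=.7 mastery level (171, 3, 1, 24). The function name and argument order are assumptions made for this illustration.

```python
import math


def agreement_and_phi(both_mastery, amt_only_mastery,
                      conventional_only_mastery, both_nonmastery):
    """Return (proportion of agreement, phi coefficient) for a 2 x 2
    table of AMT decisions (rows) by conventional decisions (columns)."""
    a, b = both_mastery, amt_only_mastery
    c, d = conventional_only_mastery, both_nonmastery
    n = a + b + c + d
    agreement = (a + d) / n
    # Phi is the Pearson correlation between two dichotomous variables.
    phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return agreement, phi


agreement, phi = agreement_and_phi(171, 3, 1, 24)
print(round(100 * agreement, 1), round(phi, 2))
```

For this table the phi value rounds to the .91 reported in Table 4 for Test 11 at P=.7.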






Information Functions

Figures 5 and 6 show the information obtained by Conventional Tests 11 and
31, respectively, and adaptive testing procedures as a function of estimated
achievement level (θ). (Points plotted in these figures are based on mean infor-
mation obtained from trainees within a plus or minus .1 range around a given θ;
numerical values of information are shown in Appendix Table B-4.) Figures 5 and
6 each show three adaptive testing information curves (one for each mastery
level examined) and one conventional test curve.

Figure 5 shows that Test 11 was poorly designed to make mastery decisions
at middle-range mastery levels (θ between -.5 and +.5, or proportion correct of
about P=.75 to P=.85), since the test's information was predominantly concen-
trated at low achievement levels (θ<-1.0), with an information spike caused by
a single highly discriminating item (Item 28; see Table 1) at about 1.0 on the
achievement continuum. Information functions for the AMT strategy at each of
the three mastery levels closely approximated the conventional information func-
tion in the region near each respective mastery level (θ=-.8, -.2, .9). In ad-
dition, as achievement level moved away from the mastery levels, the AMT infor-
mation functions fell below the information function for the conventional test,
particularly at the lower achievement levels. Further, as the difference be-
tween the achievement level and the mastery level increased, the difference in
amounts of information used by the AMT procedure and the conventional procedure
tended to become larger. However, for the P=.8 mastery level an upturn in the
information function occurred below the -1.3 achievement level, and the differ-
ence in information between the conventional and adaptive procedure decreased
slightly. The same type of upturn was noted for the P=.9 mastery level, for θ
levels below -1.1.

Figure 6 shows that for Test 31 the conventional test information function
was monotone decreasing within the observed range of trainees' achievement levels.
This implies that Test 31 provided its most precise measurement at low achieve-
ment levels and that differences between the two testing procedures should be
most noticeable at low achievement levels. The AMT information functions for
Test 31 in Figure 6 reinforce the trends noted in Test 11 for each of the mas-
tery levels. That is,

1. The AMT information functions each closely approximated the conventional
   test information function in the region of the achievement continuum
   near the appropriate mastery level.

2. For achievement levels beyond the region near the mastery level, the
   AMT information function was lower than the conventional test infor-
   mation function.

3. The difference in information between the AMT and conventional testing
   procedures was greater for achievement levels further from the speci-
   fied mastery level, up to a point.

4. At the lower end of the achievement continuum (θ<-.5), an increase in
   the amount of information provided by the AMT procedure was noted for
   each of the mastery levels examined. The point on the θ continuum at
   which the upturn was noted was lower for each successively lower cri-
   terion level.

For Test 31 one additional result was noted that did not appear in the Test 11
AMT data: For both the P=.8 and P=.9 criterion levels, a final downturn in the
information functions for the AMT procedure was observed at the lowest obtained
θ levels. This implies that the observed upturns in information may have been
one side of an information spike, possibly caused by the minimum limit of three
items placed on the AMT procedure.

Figure 5

Mean Obtained Information as a Function of Estimated Achievement
Level for AMT and Conventional Test 11 at Three Mastery Levels

Figure 6

Mean Obtained Information as a Function of Estimated Achievement
Level for AMT and Conventional Test 31 at Three Mastery Levels

Discussion and Conclusions



The unidimensional three-parameter logistic ICC model was fit to two con- 
ventional tests that were previously used to make mastery decisions in a mili- 
tary training course. Data originally gathered during the training course 
were used to evaluate, in real-data simulation, the efficiency of the proposed 
adaptive mastery testing (AMT) procedure in terms of the number of items admin- 
istered, the information obtained, and the degree of agreement between the AMT 
and conventional testing procedures. The AMT procedure was simulated assuming 
three different mastery levels, stated in terms of the achievement metric, 
through the use of the test characteristic curves (TCCs) for the two conven- 
tional tests. The results of these simulations indicated that the proposed 
AMT procedure reduced the number of items administered during the average test, 
while at the same time making decisions which were very much the same as those 
made by the conventional testing procedure. 

The AMT procedure reduced the average test length for the entire group of 
trainees by 30% to 61% of the conventional test length. The reductions in test 
length observed varied across different mastery levels for both of the conven- 
tional tests. When specific subgroups of the samples were considered, mean 
test length reductions of up to 81% of the items in the conventional test were 
again observed in almost every subgroup examined at each mastery level and for 
both tests. The only subgroup for which no test length reduction was observed 
for the AMT strategy was the group passing Test 31 at the highest criterion lev- 
el (P=.90 correct). For the groups of trainees for which the AMT procedure was 
able to make high-confidence decisions, AMT mean test lengths were 60% to 81% 
shorter than the conventional tests across all mastery levels examined. Fur- 
ther, high-confidence decisions were made for 50% to 77% of the trainees at 
each mastery level. 

At each mastery level for each test, agreement was high between the deci-
sions made by the adaptive and conventional testing procedures. The two pro-
cedures made the same decision for approximately 96% of the cases across all
circumstances. Using the larger item pool (Test 31), the two procedures agreed
for about 98% of the cases. The lowest agreement level observed was approxi-
mately 89%.

At each mastery level examined, the information functions observed for the
adaptive tests closely approximated the information functions obtained for the
relevant conventional test at achievement levels close to the mastery level, and
fell below the conventional test information functions for more extreme achieve-
ment levels. For the achievement levels very different from the mastery level,
the difference between the information functions for the two testing procedures
reached a maximum; and at the most extreme achievement levels the difference in
information decreased slightly.

Thus, the AMT procedure was shown to make mastery decisions very similar 
to those made by the conventional testing procedure, while administering fewer 
items, by using the information in the item pool that was available to make 
high-confidence decisions. 




The test-length reduction observed using the AMT procedure may be attrib-
uted to two characteristics of the procedure. First, the AMT strategy adminis-
tered to a trainee only those items which provided the most precise measurement
at the trainee's current level of θ. Second, the AMT procedure terminated the
test as soon as enough information was available to make a decision at a pre-
determined level of confidence concerning the trainee's mastery level. The
termination rule allowed the test to terminate prior to the exhaustion of the
item pool, if enough information was available in the items, and the item ad-
ministration procedure presented the most informative items early in the test-
ing session.
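
The termination rule described above can be sketched as a small function. This is a minimal illustration, assuming the Bayesian confidence interval is formed as the point estimate plus or minus a normal-theory multiple of the posterior standard deviation (1.96 for 95%); the function and argument names are hypothetical, and the report's own scoring used Owen's Bayesian procedure, which is not reproduced here.

```python
import math


def mastery_decision(theta_hat, posterior_variance, theta_mastery, z=1.96):
    """Return 'mastery', 'nonmastery', or 'continue' depending on whether
    the confidence interval around theta_hat excludes the mastery level.

    z=1.96 gives an approximate 95% interval under a normal posterior;
    that approximation is an assumption of this sketch.
    """
    half_width = z * math.sqrt(posterior_variance)
    lower = theta_hat - half_width
    upper = theta_hat + half_width
    if lower > theta_mastery:
        return "mastery"      # whole interval lies above the mastery level
    if upper < theta_mastery:
        return "nonmastery"   # whole interval lies below the mastery level
    return "continue"         # interval still contains the mastery level


# Illustrative values only (not taken from the report's data).
print(mastery_decision(1.0, 0.04, 0.0))
print(mastery_decision(-1.0, 0.04, 0.0))
print(mastery_decision(0.1, 0.25, 0.0))
```

In a full adaptive test this check would run after each item response, subject to whatever minimum test length and item-pool limits are in force.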

Each of these characteristics of the AMT procedure can be more clearly seen
by examination of the Bayesian point estimates and the associated confidence in-
tervals obtained from a trainee's responses after each item administered by the
AMT and conventional testing procedures. One such record is shown in Figure 7
for a trainee responding to Test 11. The θ estimates plotted in Figure 7 in-
clude 95% Bayesian confidence intervals for the θ estimate after the first item
and after every third item administered thereafter for both AMT and convention-
al procedures (even though the confidence interval was not used for making the
mastery decision with the conventional procedure).

Figure 7

Achievement Level Estimates for Trainee 14 after Each Item Administered by AMT
and Conventional Testing Procedures for Test 11, with 95% Bayesian Confidence
Intervals Indicated after Every Third Item (P=.7 Mastery Level)

[Plot not reproduced. Vertical axis: Achievement Level Estimate; horizontal
axis: Number of Items Administered; separate series for the Adaptive Test and
the Conventional Test.]



It may be seen from Figure 7 that both testing procedures made a nonmastery
decision for the trainee (i.e., determined that the trainee's true achievement
level fell below the specified mastery level), even though both procedures
estimated the trainee's achievement level as being above the mastery level for
the first few items. The conventional test θ estimates were above the mastery
level for the first 7 items; the adaptive test θ estimates dropped below the
mastery level after only 2 items. The AMT procedure made the mastery decision
after administering 9 items, compared with the conventional test length of 25
items. At each test length greater than a single item, the Bayesian confidence
interval around the conventional test θ estimate was larger than the confidence
interval around the AMT θ estimate. This indicates the greater measurement pre-
cision available to the AMT procedure due to the adaptive item administration
procedure.

Further, it may be noted in Figure 7 that the conventional test strategy
finally resulted in a Bayesian confidence interval that fell completely below
the mastery level after 19 items were administered (still over twice the test
length of the adaptive test); but since the conventional testing procedure does
not terminate even after this high-confidence level is reached, 6 more items
were administered before the test ended. This illustrative example showed that
the AMT procedure was far more economical than the conventional procedure in
terms of test length, due to the adaptive item selection procedure and the use
of the Bayesian confidence interval as a termination mechanism.

Additional Advantages of the AMT Strategy

The ICC-based adaptive mastery testing strategy described in this report
has several other advantages over conventional testing procedures used to make
mastery decisions. As has been demonstrated with these data, use of the ICC
metric and related achievement estimation procedures can result in mastery de-
cisions for most trainees (50% to 77%) with known and predetermined levels of
confidence. Coupled with appropriate design of mastery testing item pools using
ICC concepts, the percentage of high-confidence decisions could be substantially
increased until mastery decisions could be made for virtually all students at
the same high and predetermined level of confidence. Design of such mastery
testing item pools would include a concentration of highly discriminating items
around the mastery level, plus sufficient numbers of highly discriminating items
elsewhere along the achievement continuum to permit high-confidence decisions
to be made for all students. Actual numbers of items required at various dis-
crimination levels could be estimated using Owen's Bayesian scoring procedure
and information on the difficulties and discriminations of items to estimate in
advance the values of the Bayesian posterior variance (which is used to construct
the Bayesian confidence intervals used in the AMT procedure) at the expected
levels of θ.






If the mastery testing item pool is not designed in advance to permit high-
confidence decisions for each student, the AMT procedure still permits the test-
er to determine the confidence level of each mastery decision made, even if it
is not a high-confidence decision. This can be determined by locating the dis-
tance of the mastery level, θ_m, from the student's estimated achievement level,
θ̂. This distance, divided by the square root of the estimated posterior vari-
ance, can then be treated as a standardized deviation from the mean of a normal
distribution; and .50 plus the area of the portion of the normal distribution
included in that deviation will then give the confidence level for a given
mastery decision for that student. In this way, a confidence level for the
mastery decision can be attached to each such decision. As a result, instruc-
tional decisions based on lower confidence level mastery decisions can be made
more tentatively.
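
The confidence-level calculation just described can be written out directly. The sketch below follows the stated recipe (standardize the distance by the posterior standard deviation, then add .50 to the normal area between the mean and that deviation); the function and argument names are illustrative assumptions.

```python
import math


def decision_confidence(theta_hat, posterior_variance, theta_mastery):
    """Confidence level attached to a mastery decision, following the
    recipe in the text: .50 plus the normal-curve area between the mean
    and the standardized distance to the mastery level."""
    z = abs(theta_mastery - theta_hat) / math.sqrt(posterior_variance)
    # Area under the standard normal curve between 0 and z.
    area_mean_to_z = 0.5 * math.erf(z / math.sqrt(2.0))
    return 0.5 + area_mean_to_z


# An estimate exactly at the mastery level carries no information
# about the decision, so the confidence is .50.
print(decision_confidence(0.0, 1.0, 0.0))

# A standardized distance of 1.96 corresponds to a confidence near .975.
print(round(decision_confidence(0.0, 1.0, 1.96), 3))
```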






A further advantage of the ICC-based AMT strategy is that it can be exten- 
ded to the multiple-content area mastery testing problem with further savings in 
test administration time. In many training environments, it is desirable to 
measure mastery on a number of learning objectives at the same point in time. 
Using conventional testing procedures to measure mastery on 6 objectives, for
example, the student would have to take 6 different tests with a fixed number 
of items, for a potential total of over 100 items. However, since the AMT strat- 
egy utilizes the same item selection and scoring procedures that Brown and Weiss 
(1977) used in their intercontent branching adaptive testing strategy, the AMT 
strategy can operate in the same fashion; all that differs is the intrasubtest 
termination rule. Thus, in the multicontent branching AMT strategy, the achieve- 
ment level estimates used to make the mastery decisions in each of a number of 
content-based mastery tests would be used to serve as entry points for beginning 
testing (using appropriate multiple regression equations) in subsequent mastery 
tests in the battery. If there is any correlation between mastery decisions 
made on the separate subtests, the use of an intercontent branching AMT should 
result in substantial additional savings in testing time over that obtained by 
use of the AMT strategy in each subtest separately. 

The AMT procedure described above, or an improved version, should thus be
extremely useful in a training sequence in which many subject areas are taught
and tested within a short time, thus putting a premium on testing time. A self-
paced instructional setting in which a student is given more than one attempt
to demonstrate mastery of a content area with a single test may also benefit
from an AMT procedure that would allow students to take different items on each
attempt, thus avoiding the problem of students merely "learning" the test, with-
out learning the subject matter.

The AMT procedure should be tested in an actual classroom situation. Fur-
ther research should also be conducted to determine whether conventional mastery
testing or the AMT procedure results in mastery decisions which more accurately
predict external performance criteria.






References

Bejar, I. I., & Weiss, D. J. Computer programs for scoring test data with
     item characteristic curve models (Research Report 79-1). Minneapolis:
     University of Minnesota, Department of Psychology, Psychometric Methods
     Program, January 1979. (NTIS No. AD A067752)

Bejar, I. I., Weiss, D. J., & Gialluca, K. A. An information comparison of
     conventional and adaptive tests in the measurement of classroom achieve-
     ment (Research Report 77-7). Minneapolis: University of Minnesota,
     Department of Psychology, Psychometric Methods Program, October 1977.
     (NTIS No. AD A047495)

Bejar, I. I., Weiss, D. J., & Kingsbury, G. G. Calibration of an item pool
     for the adaptive measurement of achievement (Research Report 77-5).
     Minneapolis: University of Minnesota, Department of Psychology, Psycho-
     metric Methods Program, September 1977. (NTIS No. AD A044828)

Birnbaum, A. Some latent trait models and their use in inferring an examinee's
     ability. In F. M. Lord & M. R. Novick, Statistical theories of mental
     test scores. Reading, MA: Addison-Wesley, 1968.

Brown, J. M., & Weiss, D. J. An adaptive testing strategy for achievement
     test batteries (Research Report 77-6). Minneapolis: University of
     Minnesota, Department of Psychology, Psychometric Methods Program,
     October 1977. (NTIS No. AD A046062)

Davis, F. B., & Diamond, J. J. The preparation of criterion-referenced tests.
     In C. W. Harris, M. C. Alkin, & W. J. Popham (Eds.), Problems in criterion-
     referenced measurement. Los Angeles, CA: UCLA Graduate School of
     Education, Center for the Study of Evaluation, 1974.

Ferguson, R. L. The development, implementation, and evaluation of a computer-
     assisted branched test for a program of individually prescribed instruction
     (Doctoral dissertation, University of Pittsburgh, 1970). Dissertation
     Abstracts International, 1970, 30, 3856A. (University Microfilms No.
     70-4530)

Fisher, R. A. Contributions to mathematical statistics. New York, NY: John Wiley
     & Sons, 1950.

Glaser, R., & Klaus, D. J. Proficiency measurement: Assessing human perfor-
     mance. In R. M. Gagne (Ed.), Psychological principles in system develop-
     ment. Chicago, IL: Holt, Rinehart, & Winston, 1962.

Hambleton, R. K., & Novick, M. R. Toward an integration of theory and method
     for criterion-referenced tests. Journal of Educational Measurement, 1973.

Hambleton, R. K., Swaminathan, H., Algina, J., & Coulson, D. Criterion-
     referenced testing and measurement: A review of technical issues and
     developments. Review of Educational Research, 1978, 48, 1-48.



Horn, J. L. A rationale and test for the number of factors in factor analysis.
     Psychometrika, 1965, 30, 179-185.

Kingsbury, G. G., & Weiss, D. J. Relationships among achievement level esti-
     mates from three item characteristic curve scoring methods (Research
     Report 79-3). Minneapolis: University of Minnesota, Department of
     Psychology, Psychometric Methods Program, April 1979.

Livingston, S. A. Criterion-referenced applications of classical test theory.
     Journal of Educational Measurement, 1972, 9, 13-26.

Lord, F. M. Discussion. In W. A. Gorman (Chair), Computers and testing:
     Steps toward the inevitable conquest (PS-76-1). Washington, DC: U.S.
     Civil Service Commission, Personnel Research and Development Center,
     September 1976. (NTIS No. PB-261-694)

Lord, F. M., & Novick, M. R. Statistical theories of mental test scores.
     Reading, MA: Addison-Wesley, 1968.

McBride, J. R., & Weiss, D. J. Some properties of a Bayesian adaptive ability
     testing strategy (Research Report 76-1). Minneapolis: University of
     Minnesota, Department of Psychology, Psychometric Methods Program, March
     1976. (NTIS No. AD A022964)

Nie, N. H., Hull, C. H., Jenkins, J. G., Steinbrenner, K., & Bent, D. H.
     Statistical package for the social sciences. New York, NY: McGraw-Hill,
     1970.

Owen, R. J. A Bayesian approach to tailored testing (Research Bulletin 69-92).
     Princeton, NJ: Educational Testing Service, 1969.

Popham, W. J. (Ed.), Criterion-referenced measurement: An introduction.
     Englewood Cliffs, NJ: Educational Technology Publications, 1971.

Popham, W. J., & Husek, T. R. Implications of criterion-referenced measurement.
     Journal of Educational Measurement, 1969, 6, 1-9.

Reckase, M. D. Unifactor latent trait models applied to multifactor tests:
     Results and implications. In D. J. Weiss (Ed.), Proceedings of the 1977
     computerized adaptive testing conference. Minneapolis: University of
     Minnesota, Department of Psychology, Psychometric Methods Program, 1978.

Samejima, F. Estimation of latent ability using a response pattern of graded
     scores. Psychometrika Monograph Supplement, 1969, 34 (4, Pt. 2,
     Monograph No. 17).

Urry, V. W. A five year quest: Is computerized adaptive testing feasible?
     In C. L. Clark (Ed.), Proceedings of the first conference on computerized
     adaptive testing (U.S. Civil Service Commission, Research and Develop-
     ment Center, PS-75-6). Washington, DC: U.S. Government Printing Office,
     1976. (Superintendent of Documents Stock No. 006-000-00940-9)

Urry, V. W. Tailored testing: A successful application of latent trait theory.
     Journal of Educational Measurement, 1977, 14, 181-196.



Vale, C. D., & Weiss, D. J. A study of computer-administered stradaptive
     ability testing (Research Report 75-4). Minneapolis: University of
     Minnesota, Department of Psychology, Psychometric Methods Program, October
     1975. (NTIS No. AD A018758)

Weiss, D. J. The stratified adaptive computerized ability test (Research
     Report 73-3). Minneapolis: University of Minnesota, Department of Psy-
     chology, Psychometric Methods Program, September 1973. (NTIS No. AD
     768376)



Appendix A

Illustration of the Procedure for Selecting Items for AMT

The essential characteristics of the adaptive testing strategy employed in
this study have been described in previous sections. However, to understand the
method more completely, it is helpful to see the results of its application with
an actual testee.

Figure A-1 shows estimated item information curves for six items from Test 1.
(There would probably be many items in the test, but only six were chosen to sim-
plify the illustration.) The height of the information curve at a given achieve-
ment level (θ) indicates the amount of information provided by the item. Most
of the items are fairly "peaked"; that is, they provide information over a rel-
atively narrow range of the achievement continuum. While the information curves
overlap to some degree, different items provide different amounts of information
at a given point on the achievement continuum. The guiding principle for the
adaptive procedure is to administer the item which provides the most information
at the current achievement estimate (θ̂).

Figure A-1

Estimated Item Information Curves for Six Items from Test 1

[Plot not reproduced. Horizontal axis: Achievement Level (θ); the vertical
scale extends to 2.0.]



For a testee beginning Test 1, the initial achievement estimate was θ̂=0;
this is shown by the vertical dashed line in Figure A-1. Of the six items in
the example, only three items had essentially nonzero information values at θ=0;
these values, shown by the horizontal dotted lines in Figure A-1, were .95 for
Item 5, .60 for Item 15, and .10 for Item 12. Applying the rule that the item
selected is the one which provides the most information at the current θ̂,
Item 5 would be selected for administration.






Figure A-2 shows the revised value of θ̂=.46 derived from the Bayesian
scoring routine, assuming that a correct answer was given to Item 5. The con-
fidence interval surrounding this θ̂ is assumed to contain the mastery level,
so testing would continue. The information curve for Item 5, which was al-
ready administered, is not shown in Figure A-2. At the new value of θ̂, only
Items 15 and 12 provide significant values of information. Since Item 15 has
an information value of .60 and Item 12 has a value of .20, Item 15 would be
selected as the second item to be administered to this testee.



Figure A-2

Estimated Item Information Curves for Five Items from Test 1

[Plot not reproduced. Horizontal axis: Achievement Level (θ); vertical scale
from 0.0 to 2.0.]



Assuming that the testee had correctly answered Item 15, the value of θ̂
would increase to .92. The confidence interval around this new θ̂ still contains
the arbitrary mastery level, so testing would continue. At θ̂=.92, only Item 12
would provide significant amounts of information, and it would be administered
next. Thus, at each step during the testing procedure, the item which provides
the most information concerning the testee's current level of θ is administered.
In a larger item pool, testing would continue in this fashion until it was pos-
sible to make a mastery decision with a prespecified level of confidence, at
which point the test would terminate.
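
The selection rule illustrated in this appendix can be sketched in code. The example below uses the standard three-parameter logistic item information function with scaling constant D = 1.7; the item parameters are hypothetical values chosen only so that the selections mirror the walk-through (they are not the calibrated parameters of the report's items), and all function names are illustrative.

```python
import math

D = 1.7  # conventional logistic scaling constant


def p_correct(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def item_information(theta, a, b, c):
    """Item information for the three-parameter logistic model."""
    p = p_correct(theta, a, b, c)
    return (D * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def select_item(theta_hat, pool, administered):
    """Pick the unadministered item with maximum information at theta_hat."""
    candidates = [item for item in pool if item not in administered]
    return max(candidates, key=lambda item: item_information(theta_hat, *pool[item]))


# Hypothetical (a, b, c) parameters keyed by item number.
pool = {
    5: (1.2, 0.0, 0.2),
    15: (1.0, 0.5, 0.2),
    12: (1.1, 1.2, 0.2),
}

first = select_item(0.0, pool, administered=set())
second = select_item(0.46, pool, administered={5})
third = select_item(0.92, pool, administered={5, 15})
print(first, second, third)
```

With these assumed parameters the items are chosen in the order 5, 15, 12, matching the sequence in the illustration; a different pool would of course give a different sequence.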






Appendix B
Supplementary Tables



Table B-1

Eigenvalues of the First 10 Common Factors Extracted
from Item Intercorrelations for Test 11 and Test 31,
and for Parallel Random-Data Factors

               Test 11               Test 31
            Real     Random       Real     Random
Factor      Data     Data         Data     Data

   1        6.14     1.75        10.23     2.04
   2        1.85     1.61         3.08     1.90
   3        1.60     1.52         2.10     1.84
   4        1.41     1.51         2.06     1.82
   5        1.38     1.50         1.82     1.76
   6        1.30     1.39         1.68     1.72
   7        1.24     1.33         1.58     1.62
   8        1.16     1.28         1.49     1.58
   9        1.15     1.25         1.38     1.56
  10         .97     1.20         1.31     1.48
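
One common way to read a table of real-data and parallel random-data eigenvalues is the parallel-analysis comparison associated with Horn (1965): count the leading factors whose real-data eigenvalue exceeds the corresponding random-data eigenvalue. The sketch below applies that comparison to the values in Table B-1; it is offered as an illustration of the comparison only, and the function name is an assumption.

```python
def leading_factors_above_random(real, random_data):
    """Count consecutive leading factors whose real-data eigenvalue
    exceeds the parallel random-data eigenvalue."""
    count = 0
    for real_value, random_value in zip(real, random_data):
        if real_value <= random_value:
            break
        count += 1
    return count


# Eigenvalues from Table B-1 (factors 1 through 10).
test_11_real = [6.14, 1.85, 1.60, 1.41, 1.38, 1.30, 1.24, 1.16, 1.15, 0.97]
test_11_random = [1.75, 1.61, 1.52, 1.51, 1.50, 1.39, 1.33, 1.28, 1.25, 1.20]
test_31_real = [10.23, 3.08, 2.10, 2.06, 1.82, 1.68, 1.58, 1.49, 1.38, 1.31]
test_31_random = [2.04, 1.90, 1.84, 1.82, 1.76, 1.72, 1.62, 1.58, 1.56, 1.48]

n_test_11 = leading_factors_above_random(test_11_real, test_11_random)
n_test_31 = leading_factors_above_random(test_31_real, test_31_random)
print(n_test_11, n_test_31)
```

In both tests the first eigenvalue is several times larger than any later one, which is the pattern usually examined when judging whether a unidimensional model is a workable approximation.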






Table B-2

Frequency Distributions of Number of Items Administered by
AMT Procedure from Test 11 by Mastery Subgroup for
Each Mastery Level (P=.7, .8, and .9)

                                          Group
Number of                                                          High
Items              Total           Mastery        Nonmastery     Confidence
Administered  P=.7 P=.8 P=.9  P=.7 P=.8 P=.9  P=.7 P=.8 P=.9  P=.7 P=.8 P=.9



3 


43 


54 


45 


39 


39 




4 


15 


45 


43 


54 


45 


4 


1 


1 


24 








1 


1 


24 


1 


1 


24 


5 


36 


1 


10 


36 








1 


1*0 


36 


1 


10 


6 


1 




3 








1 




3 


1 




3 


7 


10 


3 


17 


10 




8 




3 


9 


10 


3 


17 


8 


13 


2 


2 


13 


1 






1 


2 


13 


2 


2 


9 


3 




2 








3 




2 


3 




2 


10 






1 












1 






1 


11 


7 




6 


7 










6 


7 




6 


12 


1 




3 








1 




3 


1 




3 


13 


3 


2 




3 








2 




3 


2 




14 


1 


2 


1 


1 










1 


1 


2 


1 


15 




3 


2 




2 






1 


2 




3 


2 


16 


1 


1 


1 


1 








1 


1 


1 


1 


1 


17 


1 


7 


4 




6 




1 


1 


4 


1 


7 


4 


18 


7 




2 


7 










2 


7 




2 


19 


4 


6 


1 


1 






3 


6 


1 


4 


6 


1 


20 


4 






4 












4 






21 




1 


3 










1 


3 




1 


3 


22 




2 


2 




2 








2 




2 


2 


23 


1 




3 


1 










3 


1 




3 


24 


7 


9 




7 


8 






1 




7 


9 




25 


55 


J05 


67 


44 


68 


26 


11 


37 


41 


10 


6 








Table B-3

Frequency Distributions of Number of Items Administered by AMT Procedure
from Test 31 by Mastery Subgroup for Each Mastery Level (P=.7, .8, and .9)

                                          Group
Number of                                                          High
Items              Total          Mastery        Nonmastery      Confidence
Administered  P=.7 P=.8 P=.9  P=.7 P=.8 P=.9  P=.7 P=.8 P=.9  P=.7 P=.8 P=.9



J 




7 


86 






7 


7 
/ 


RA 


7 
/ 


7 


o u 


4 


27 


1 


11 
X X 


27 






1 

X 


1 1 

X X 


27 


1 

X 


l l 


5 


4 • 


15 


7 






4 


1 s 

X J 


7 


A 


1 S 

X J 


7 

f 


6 


8 ' 


6 


A 


7 




1 


A 

u 


A 

u 


ft 

o 


A 


A 

D 


7 


10 


10 


4 


1 0 


7 






A 


1 0 

X V 


1 0 

X \J 


A 


8 


» 6 


4 


2 


5 




1 


A 


? 


A 


A 


2 


9 


6 


3 


1 


5 




1 


3 


1 

L 


A 
u 


3 


l 

X 


10 


7 


- 6 


7 


5 


4 


2 


2 


7 


7 




7 


11 


1 


10 


4 




3 


1 


7 


4 


] 


10 


4 


12 


1 


4 


3 


1 


1 




j 


3 


1 

X 


A 


j 


13 * 


2 


g 


2 


2 






8 


2 


2 


ft 

O 


2 


14 


5 


1 


2 


4 




1 


1 


2 


5 


1 


2 


15 


3 


2 




3 






2 




3 


2 




16 


5 


2 


2 


5 






2 


2 


5 


2 


2 


17 


2 fc 


6 




1 


5 


1 


1 




2 






18 


3 


6 




1 


5 


2 


1 




3 






19 


4 


4 


1 


1 


3 


3 


1 


1 


4 


4 


1 


20 


2 


1 




1 




1 


1 




2 


1 




21 


3 


4 




2 




1 


4 




3 


4 




?2 






I 










2 






2, 


23 


5 


2 




4 




1 


2 




5 


2 




24 


1 


5 


1 


1 


2 




3 


1 


1 


5 ' 


1 


25 


1 


1 




1 






1 




1 


1 




26 
























27 


1 


1 




1 


1 








1 


1 




28 




1 


1 




1 






1 




1 


1 


29 


3 


1 






1 


3 






3 


1 




30 


1 




2 






1 




2 


1 




2 


31 


2 


1 




1 




1 


1 




2 


1 




32 




1 










1 






1 




33 




1 


1 




1 






1 




1 


1 


34 






2 










2 






2 


35 


i 






1 










1 






36 


1 


2 


2 






1 


2 


2 


1 


2 


2 


37 




1 


2 








1 


2 




1 


2 


38 


78 


83 


49 


43 


40 28 


35 


43 


21 












Table B-4

Mean Information (I) Obtained by AMT and Conventional Testing Procedures for Tests 11 and 31
at Three Mastery Levels (P=.7, .8, and .9) for Trainees with Various Achievement Level
Estimates (θ̂), and Number of Trainees (N) at Each Achievement Level



Test 11 



AMT 



Range     Conventional     (P=.7)



(P=.8)



Lo 




Hi 


I 


N


-2.000 




.800 




■ i J 


1 


-1.799 


-1 


.600 


1 1 
1 1 


ftft 
■ oo 


5 


-1.599 


-1 


.400 


1 1 

X. X 


59 


3 


-1.399 


-1 


.200 


1 0 

X w 


Sft 

• -JO 


3 


-1.199 


-1 


.000 


Q 

o 


87 


10 


-.999 




.800 


7 


A 7 


4 


-.799 




. 600 


6 


.61 


X L. 


-.599 




.400 


5 


.41 


10 


-.399 




.200 


4 


.65 


14 


-.199 




.000 


4 


.04 


21 


.001 




.200 


3 


.73 


15 


.201 




.400 


3 


.72 


19 


.401 




.600 


4 


.91 


19 


.601 




.800 


8 


.86 


22 


.801 


1 


.000 


17 


.CI 


16 


1.001 


1 


.200 


16 


.45 


9 


1.201 


1 


.400 


6 


.35 


3 


1.401 


1 


.600 


3 


70 


3 


1.601 


1 


.800 


1 


63 


5 


1.801 


2 


000 









I 



7.42 
10.27 
10.46 
8.93 
7.68 
6.63 
5.51 
4.73 
3.79 
2.69 
2.21 
4.69 
8.80 
14.40 



1.29 39 



I 



2. 

7 

4 

6 

6 
10 

9 

9 
16 
24 
47 

4 
11 

1 



6 


.20 


7 


4 


.47 


4 


.46 


12 


3 


.19 


6 


.56 


4 


1 


.60 


7 


.17 


11 






6 


.69 


10 




.34 


5 


.56 


10 


2 


.53 


4 


.68 


18 


3 


05 


4 


03 


18 


3 


68 


3 


vt- 


"20 


3 


64 


3 


71 


9 


3 


73 


4 


60 


9 


4. 


56 


8. 


37 


16 


9. 


41 


15. 


81 


• 9 


16. 


09 


13. 


94 


3 


17. 


33 








6. 


10 


1. 


29 


39 


3. 


16 








1. 


41 



47 
25 
20 
10 
15 
13 
12 
9 
9 
7 
2 
3 
8. 



Test 31 



AMT 



(P=.9)     Conventional     (P=.7)



r 



(P=.8)



23.44 
18.08 
15.53 
12.31 
10.11 
9.30 
8.79 
8.46 
8.06 
7.58 
6.96 
6.48 
6.05 
5.78 
5.24 
4.89 
4.05 
2.94 



N 



2 
1 
10 
9 
18 
21 
15 
'19 
18 
12 
15 
16 
7 
6 
11 
6 
4 
5 



19.58 
10.10 
9.36 
9.19 
10.07 
9.35 
8.78 
8.43 
8.06 
7.57 
5.83 
3.88 
2.60 
2.93 
1.97 



1 

8 
12 
11 

7 
21 
15 
17 
15 

4 
15 
23 
18 

5 
27 



4.82 
7.23 
6.57 
4.71 
5.80 
7.61 
8.40 
8.04 
7.51 
6.95 
6.50 
6.00 
5.61 
4.67 
3.51 
2.20 



4 
10 
8 
22 
28 
12 
17 
17 
12 
12 
14 
6 
9 
13 
8 
7 



(P=.9)



I 



8.00 

10.36 
2.17 



.24 
.70 
.96 
.10 



6.89 
6.51 
6.00 
5.74 
5.18 
4.86 
3.84 
3.41 
2.13 



N 



4 
25 
24 
58 
22 
11 
4 
8 
6 
6 
10 
6 
3 
1 
5 



DISTRIBUTION LIST



Navy 



Dr. Ed Aiken
Navy Personnel R&D Center
San Diego, CA 92152

Dr. Jack R. Borsting
Provost & Academic Dean
U.S. Naval Postgraduate School
Monterey, CA 93940

Dr . Robert fireman 
Code N-M 
NAVTRAEQUIPCEN 
Orlando, FL j23l 5 

MR. MAURICE CALLAHAN
Pers 23a
Bureau of Naval Personnel
Washington, DC 20370

Dr. Richard Elster
Department of Administrative Sciences
Naval Postgraduate School
Monterey, CA 93940

DR. PAT FEDERICO
NAVY PERSONNEL R&D CENTER
SAN DIEGO, CA 92152

Dr. Paul Foley
Navy Personnel R&D Center
San Diego, CA 92152

Dr. John Ford
Navy Personnel R&D Center
San Diego, CA 92152

CAPT. D.M. GRAGG, MC, USN
HEAD, SECTION ON MEDICAL EDUCATION
UNIFORMED SERVICES UNIV. OF THE
HEALTH SCIENCES
6917 ARLINGTON ROAD
BETHESDA, MD 20014

CDR Robert S. Kennedy 
Naval Aerospace Medical and
Research Lab

New Orleans, LA 70 if ) 

Dr. Norman J. Kerr
Chief of Naval Technical Training
Naval Air Station Memphis (75)
Millington, TN 38054

Dr. Leonard Kroeker
Navy Personnel R&D Center
San Diego, CA 92152

CHAIRMAN, LEADERSHIP & LAW DEPT.
DIV. OF PROFESSIONAL DEVELOPMENT
U.S. NAVAL ACADEMY
ANNAPOLIS, MD 21402

Dr. William L. Maloy
Principal Civilian Advisor for
Education and Training
Naval Training Command, Code 00A
Pensacola, FL 32508

CAPT Richard L. Martin 

H3S Francis Marlon N { IPA-Z ) ' 

FPU New York, NY 095^1 



Dr. James McBride
Code 301
Navy Personnel R&D Center
San Diego, CA 92152

2 Dr. James McGrath
Navy Personnel R&D Center
Code 306
San Diego, CA 92152

1 DR. WILLIAM MONTAGUE
LRDC
UNIVERSITY OF PITTSBURGH
3939 O'HARA STREET
PITTSBURGH, PA 15213

1 Commanding Officer
Naval Health Research Center
Attn: Library
San Diego, CA 92152

1 Naval Medical R&D Command
Code 44
National Naval Medical Center
Bethesda, MD 20014

Library
Navy Personnel R&D Center
San Diego, CA 92152

6 Commanding Officer
Naval Research Laboratory
Code 2627
Washington, DC 20390

1 OFFICE OF CIVILIAN PERSONNEL
(CODE
DEPT. OF THE NAVY
WASHINGTON, DC 20390

1 JOHN OLSEN
CHIEF OF NAVAL EDUCATION &
TRAINING SUPPORT
PENSACOLA, FL 32509

1 Psychologist
ONR Branch Office
495 Summer Street
Boston, MA 02210

1 Psychologist
ONR Branch Office
536 S. Clark Street
Chicago, IL 60605

1 Office of Naval Research
Code 200
Arlington, VA 22217

1 Code 436
Office of Naval Research
Arlington, VA 22217

Office of Naval Research
Code 437
800 N. Quincy Street
Arlington, VA 22217



1 Psychologist 

OFFICE OF NAVAL RESEARCH BRANCH 
223 OLD MARYLEBONE ROAD 
LONDON, NW, 15TH ENGLAND 

1 Psychologist
ONR Branch Office
1030 East Green Street
Pasadena, CA 91101

1 Scientific Director
Office of Naval Research
Scientific Liaison Group/Tokyo
American Embassy
APO San Francisco, CA 96503

1 Office of the Chief of Naval Operations
Research, Development, and Studies Branch
(OP-102)
Washington, DC 20350

1 Scientific Advisor to the Chief of 
Naval Personnel (Pers-Or) 
Naval Bureau of Personnel 
Room ^10. Arlington Annex 
Washington, DC 20370 

1 LT Frank C. Petho, MSC, USNR (Ph.D)
Code L51
Naval Aerospace Medical Research Laboratory
Pensacola, FL 32508

1 DR. RICHARD A. POLLAK
ACADEMIC COMPUTING CENTER
U.S. NAVAL ACADEMY
ANNAPOLIS, MD 21402

1 Roger W. Remington, Ph.D 
Code L52 
NAMRL 

Pensacola , FL 32503 

1 Dr. Bernard Rimland
Navy Personnel R&D Center
San Diego, CA 92152

Mr. Arnold Rubenstein
Naval Personnel Support Technology
Naval Material Command (08T244)
Room 1044, Crystal Plaza #5
2221 Jefferson Davis Highway
Arlington, VA 20360

1 Dr. Worth Scanland
Chief of Naval Education and Training
Code N-5
NAS, Pensacola, FL 32508

1 A. A. SJOHOLM
TECH. SUPPORT, CODE 201
NAVY PERSONNEL R&D CENTER
SAN DIEGO, CA 92152

1 Mr. Robert Smith
Office of Chief of Naval Operations
OP-937F
Washington, DC 20350



Personnel & Training Research Programs
Office of Naval Research
Arlington, VA 22217

Dr. Alfred F. Smode
Training Analysis & Evaluation Group
(TAEG)
Dept. of the Navy
Orlando, FL 32813

1 Dr. Richard Sorensen
Navy Personnel R&D Center
San Diego, CA 92152

1 CDR Charles J. Theisen, JR. MSC, USN
Head Human Factors Engineering Div.
Naval Air Development Center
Warminster, PA 18974

1 W. Gary Thomson
Naval Ocean Systems Center
Code 71 p
San Diego, CA 92152

1 Dr. Ronald Weitzman
Department of Administrative Sciences
U.S. Naval Postgraduate School
Monterey, CA 93940

1 DR. MARTIN F. WISKOFF
NAVY PERSONNEL R&D CENTER
SAN DIEGO, CA 92152



Army 



Technical Director
U.S. Army Research Institute for the
Behavioral and Social Sciences
5001 Eisenhower Avenue
Alexandria, VA 22333

H'w UStREUE * 7th Army 

USMREUF Director of GEl' 
APO New York WO J 

1 

LCOL Gary Ploedorn
Training Effectiveness Analysis Division
US Army TRADOC Systems Analysis Activity
White Sands Missile Range, NM 88002

DR. RALPH DUSEK
U.S. ARMY RESEARCH INSTITUTE
5001 EISENHOWER AVENUE
ALEXANDRIA, VA 22333

Dr. Myron Fischl
U.S. Army Research Institute for the
Social and Behavioral Sciences
5001 Eisenhower Avenue
Alexandria, VA 22333

Dr. Ed Johnson
Army Research Institute
5001 Eisenhower Blvd.
Alexandria, VA 22333

Dr. Michael Kaplan
U.S. ARMY RESEARCH INSTITUTE
5001 EISENHOWER AVENUE
ALEXANDRIA, VA 22333

Dr. Milton S. Katz
Individual Training & Skills
Evaluation Technical Area
U.S. Army Research Institute
5001 Eisenhower Avenue
Alexandria, VA 22333

Dr. Beatrice J. Farr
Army Research Institute (PERI-OK)
5001 Eisenhower Avenue
Alexandria, VA 22333

Dr. Milt Haier
U.S. ARMY RESEARCH INSTITUTE
5001 EISENHOWER AVENUE
ALEXANDRIA, VA 22333

Dr. Harold F. O'Neil, Jr.
ATTN: PERI-OK
5001 EISENHOWER AVENUE
ALEXANDRIA, VA 22333

Dr. Robert Rosa
U.S. Army Research Institute for the
Social and Behavioral Sciences
5001 Eisenhower Avenue
Alexandria, VA 22333

Dr. Robert Sasmor
U.S. Army Research Institute for the
Behavioral and Social Sciences
5001 Eisenhower Avenue
Alexandria, VA

Dire*, tor , Irmrung Develops nt 
•J.', iVmy AlnimstruT lun Irntt r 
A [TV Pr. t»i- mil 

. Denjptnin fjirrison, IN 'l^l* 

Dr. Frederick Steinheiser
U.S. Army Research Institute
5001 Eisenhower Avenue
Alexandria, VA 22333

Dr. Joseph Ward
U.S. Army Research Institute
5001 Eisenhower Avenue
Alexandria, VA 22333



Dr. Malcolm Ree
AFHRL/PED
Brooks AFB, TX 78235



Marines



1 H. William Greenup
Education Advisor (E031)
Education Center, MCDEC
Quantico, VA 22134



Air Force 



Air Force Human Resources Lab
AFHRL/PED
Brooks AFB, TX 78235

Air University Library 
AUL/L3E 7ft/****3 
Maxwell AFD. AL 

Dr. Philip De Leo
AFHRL/TT
Lowry AFB, CO 80230

DR. G. A. ECKSTRAND
AFHRL/AS
WRIGHT-PATTERSON AFB, OH 45433

Dr. Genevieve Haddad
Program Manager
Life Sciences Directorate
AFOSR
Bolling AFB, DC 20332

cm. MFR f FR 
CNFT LUT, '^FICFR 
AFHRL/FLYIMG TRAlNlfH, 0 T V. 
WILLMM., AFP, A7 'S?2<4 

Or. Rood L. Mcr^m fAFHU/V 1 ! 
Wright -Patterson AFB 

Ohio lib 1 *! i 

Dr. Roger Penneil 
AfMPL/TT 

Lowry AFP, C" ■« ^ r 

P^r.^nn^. Am 1 y lis Di v i Mun 
HC USAF'DPXXA 
^iViin^ton, DC <~> n >r 



Director, Office of Manpower Utilization
HQ, Marine Corps (MPU)
BCB, Bldg. 2009
Quantico, VA 22134

DR. A.L. SLAFKOSKY
SCIENTIFIC ADVISOR (CODE RD-1)
HQ, U.S. MARINE CORPS
WASHINGTON, DC 20380



Coast Guard



V. Richard Lantermon 
PSYCHOLOGICAL RESEARCH (G-P-1/62) 
U.S. COAST GUARD HQ 
WASHINGTON. DC 20590 

Dr. Thomas Warm
U.S. Coast Guard Institute
P. O. Substation 18
Oklahoma City, OK 73169



Other DoD 



12 Defense Documentation Center
Cameron Station, Bldg. 5
Alexandria, VA 22314
Attn: TC

1 Dr. Dexter Fletcher
ADVANCED RESEARCH PROJECTS AGENCY
1400 WILSON BLVD.
ARLINGTON, VA 22209

1 Dr. William Graham
Testing Directorate
MEPCOM
Ft. Sheridan, IL 60037

1 Military Assistant for Training and
Personnel Technology
Office of the Under Secretary of Defense
for Research & Engineering
Room 3D129, The Pentagon
Washington, DC 20301

1 MAJJR Wayne U'lUan, USAF 

Office of the Assistant Secretary 

of Defense ( HRAJ ) 
jnrjii f^e Pentagon 
Washington , DC 20 30 1 






R^s»»arch Kr^icr 

AFrpc/np*np 

^an lolph. AFFi, TX ?• m tJ 






Civil Govt 



Dr. Susan Chipman
Basic Skills Program
National Institute of Education
1200 19th Street NW
Washington, DC 20208

Dr. William Gorham, Director
Personnel R&D Center
Office of Personnel Management
1900 E Street NW
Washington, DC 20415

Dr. Joseph I. Lipson
Division of Science Education
Room W-638
National Science Foundation
Washington, DC 20550

Dr. John Mays
National Institute of Education
1200 19th Street NW
Washington, DC 20208

Dr. Arthur Melmed
National Institute of Education
1200 19th Street NW
Washington, DC 20208

Dr. Andrew P. Molnar
Science Education Dev.
and Research
National Science Foundation
Washington, DC 20550

Dr. Lalitha P. Sanathanan
Environmental Impact Studies Division
Argonne National Laboratory
9700 S. Cass Avenue
Argonne, IL 60439

Dr. Jeffrey Schiller
National Institute of Education
1200 19th St. NW
Washington, DC 20208

Dr. Thomas G. Sticht
Basic Skills Program
National Institute of Education
1200 19th Street NW
Washington, DC 20208

Dr. Vern W. Urry
Personnel R&D Center
Office of Personnel Management
1900 E Street NW
Washington, DC 20415

Dr. Joseph L. Young, Director
Memory & Cognitive Processes
National Science Foundation
Washington, DC 20550



Dr. Earl A. Alluisi
HQ, AFHRL (AFSC)
Brooks AFB, TX 78235

Dr. Erling B. Anderson
University of Copenhagen
Studiestraedet
Copenhagen
DENMARK

1 Psychological Research Unit
Dept. of Defense (Army Office)
Campbell Park Offices
Canberra ACT 2600, Australia

1 Dr. Alan Baddeley
Medical Research Council
Applied Psychology Unit
15 Chaucer Road
Cambridge CB2 2EF
ENGLAND

1 Dr. Isaac Bejar
Educational Testing Service
Princeton, NJ 08450

1 Dr. Werner Birke
Streitkraefteamt
Rosenberg 5300
Bonn, West Germany D-5300

1 Dr. R. Darrel Bock
Department of Education
University of Chicago
Chicago, IL 60637

1 Dr. Nicholas A. Bond
Dept. of Psychology
Sacramento State College
600 Jay Street
Sacramento, CA 95819

1 Dr. David G. Bowers
Institute for Social Research
University of Michigan
Ann Arbor, MI 48106

1 Dr. Robert Brennan
American College Testing Programs
P. O. Box 168
Iowa City, IA 52240

1 DR. C. VICTOR BUNDERSON
WICAT INC.
UNIVERSITY PLAZA, SUITE 10
1160 SO. STATE ST.
OREM, UT 84057

1 Dr. John B. Carroll
Psychometric Lab
Univ. of No. Carolina
Davie Hall 013A
Chapel Hill, NC 27514

1 Charles Myers Library
Livingstone House
Livingstone Road
Stratford
London E15 2LJ
ENGLAND

1 Dr. John Chiorini
Litton-Mellonics
Box 1286
Springfield, VA 22151

Dr. Kenneth E. Clark
College of Arts & Sciences
University of Rochester
River Campus Station
Rochester, NY 14627

Dr. Norman Cliff
Dept. of Psychology
Univ. of So. California
University Park
Los Angeles, CA 90007

Dr. William Coffman
Iowa Testing Programs
University of Iowa
Iowa City, IA 52242




Dr. Allan M. Collins
Bolt Beranek & Newman, Inc.
50 Moulton Street
Cambridge, MA 02138

Dr. Meredith Crawford
Department of Engineering Administration
George Washington University
Suite 805
2101 L Street N. W.
Washington, DC 20037

Dr. Hans Crombag
Education Research Center
University of Leyden
Boerhaavelaan 2
Leyden
The NETHERLANDS

MAJOR I. N. EVONIC
CANADIAN FORCES PERS. APPLIED RESEARCH
1107 AVENUE ROAD
TORONTO, ONTARIO, CANADA

Dr. Leonard Feldt
Lindquist Center for Measurement
University of Iowa
Iowa City, IA 52242

Dr. Richard L. Ferguson
The American College Testing Program
P.O. Box 168
Iowa City, IA 52240

Dr. Victor Fields
Dept. of Psychology
Montgomery College
Rockville, MD 20850

Dr. Gerhard Fischer
Liebigasse 5
Vienna 1010
Austria

Dr. Donald Fitzgerald
University of New England
Armidale, New South Wales 2351
AUSTRALIA

Dr. Edwin A. Fleishman
Advanced Research Resources Organ.
Suite 900
4330 East West Highway
Washington, DC 20014

Dr. John R. Frederiksen
Bolt Beranek & Newman
50 Moulton Street
Cambridge, MA 02138

DR. ROBERT GLASER
LRDC
UNIVERSITY OF PITTSBURGH
3939 O'HARA STREET
PITTSBURGH, PA 15213

Dr. Ross Greene
CTB/McGraw Hill
Del Monte Research Park
Monterey, CA 93940

Dr. Alan Gross
Center for Advanced Study in Education
City University of New York
New York, NY 10036

Dr. Ron Hambleton
School of Education
University of Massachusetts
Amherst, MA 01002



Dr. Chester Harris
School of Education
University of California
Santa Barbara, CA 93106

Dr. Lloyd Humphreys
Department of Psychology
University of Illinois
Champaign, IL 61820

Library
HumRRO/Western Division
27857 Berwick Drive
Carmel, CA 93921

Dr. Steven Hunka
Department of Education
University of Alberta
Edmonton, Alberta
CANADA

Dr. Earl Hunt
Dept. of Psychology
University of Washington
Seattle, WA 98105

Dr. Huynh Huynh
Department of Education
University of South Carolina
Columbia, SC 29208

Dr. Carl J. Jensema
Gallaudet College
Kendall Green
Washington, DC 20002

Dr. Arnold F. Kanarick
Honeywell, Inc.
2600 Ridgeway Pkwy
Minneapolis, MN 55413

Dr. John A. Keats
University of Newcastle
Newcastle, New South Wales
AUSTRALIA

Mr. Marlin Kroger
1117 Via Goleta
Palos Verdes Estates, CA 90274

LCOL. C.R.J. LAFLEUR
PERSONNEL APPLIED RESEARCH
NATIONAL DEFENSE HQS
101 COLONEL BY DRIVE
OTTAWA, CANADA K1A 0K2

Dr. Michael Levine
Department of Educational Psychology
University of Illinois
Champaign, IL 61820

Faculteit Sociale Wetenschappen
Rijksuniversiteit Groningen
Oude Boteringestraat
Groningen
NETHERLANDS

Dr. Robert Linn
College of Education
University of Illinois
Urbana, IL 61801

Dr. Frederick M. Lord
Educational Testing Service
Princeton, NJ 08450

Dr. Robert R. Mackie
Human Factors Research, Inc.
6780 Cortona Drive
Santa Barbara Research Pk.
Goleta, CA 93017



Dr. Gary Marco
Educational Testing Service
Princeton, NJ 08450

Dr. Scott Maxwell
Department of Psychology
University of Houston
Houston, TX 77025

Dr. Sam Mayo
Loyola University of Chicago
Chicago, IL 60601

Dr. Allen Munro
Univ. of So. California
Behavioral Technology Labs
3717 South Hope Street
Los Angeles, CA 90007

Dr. Melvin R. Novick
Iowa Testing Programs
University of Iowa
Iowa City, IA 52242

Dr. Jesse Orlansky
Institute for Defense Analysis
Army Navy Drive
Arlington, VA 22202

Dr. James A. Paulson
Portland State University
P.O. Box 751
Portland, OR 97207

MR. LUIGI PETRULLO
2431 N. EDGEWOOD STREET
ARLINGTON, VA 22207

DR. STEVEN M. PINE
4950 Douglas Avenue
Golden Valley, MN 55416

DR. DIANE M. RAMSEY-KLEE
R-K RESEARCH & SYSTEM DESIGN
3947 RIDGEMONT DRIVE
MALIBU, CA 90265

MIN. RET. M. RAUCH
P II 4
BUNDESMINISTERIUM DER VERTEIDIGUNG
POSTFACH 161
53 BONN 1, GERMANY

Dr. Peter B. Read
Social Science Research Council
605 Third Avenue
New York, NY 10016

Dr. Mark D. Reckase
Educational Psychology Dept.
University of Missouri-Columbia
12 Hill Hall
Columbia, MO 65201

Dr. Fred Reif
SESAME
c/o Physics Department
University of California
Berkeley, CA 94720

Dr. Andrew M. Rose
American Institutes for Research
1055 Thomas Jefferson St. NW
Washington, DC 20007

Dr. Leonard L. Rosenbaum, Chairman
Department of Psychology
Montgomery College
Rockville, MD 20850

Dr. Ernst Z. Rothkopf
Bell Laboratories
600 Mountain Avenue
Murray Hill, NJ 07974

Dr. Donald Rubin
Educational Testing Service
Princeton, NJ 08450

Dr. Larry Rudner
Gallaudet College
Kendall Green
Washington, DC 20002

Dr. J. Ryan
Department of Education
University of South Carolina
Columbia, SC 29208

PROF. FUMIKO SAMEJIMA
DEPT. OF PSYCHOLOGY
UNIVERSITY OF TENNESSEE
KNOXVILLE, TN 37916

DR. ROBERT J. SEIDEL
INSTRUCTIONAL TECHNOLOGY GROUP
HUMRRO
300 N. WASHINGTON ST.
ALEXANDRIA, VA 22314

Dr. Kazuo Shigemasu
University of Tohoku
Department of Educational Psychology
Kawauchi, Sendai 982
JAPAN

Dr. Edwin Shirkey
Department of Psychology
Florida Technological University
Orlando, FL 32816

Dr. Robert Smith
Department of Computer Science
Rutgers University
New Brunswick, NJ 08903

Dr. Richard Snow
School of Education
Stanford University
Stanford, CA 94305

Dr. Robert Sternberg
Dept. of Psychology
Yale University
Box 11A, Yale Station
New Haven, CT 06520

DR. ALBERT STEVENS
BOLT BERANEK & NEWMAN, INC.
50 MOULTON STREET
CAMBRIDGE, MA 02138

DR. PATRICK SUPPES
INSTITUTE FOR MATHEMATICAL STUDIES IN
THE SOCIAL SCIENCES
STANFORD UNIVERSITY
STANFORD, CA 94305

Dr. Hariharan Swaminathan
Laboratory of Psychometric and
Evaluation Research
School of Education
University of Massachusetts
Amherst, MA 01003

Dr. Brad Sympson
Office of Data Analysis Research
Educational Testing Service
Princeton, NJ 08541






1 Dr. Kikumi Tatsuoka
Computer Based Education Research
Laboratory
252 Engineering Research Laboratory
University of Illinois
Urbana, IL 61801

Dr. Maurice Tatsuoka
Department of Educational Psychology
University of Illinois
Champaign, IL 61801

Dr. David Thissen
Department of Psychology
University of Kansas
Lawrence, KS 66044

Dr. Robert Tsutakawa
Dept. of Statistics
University of Missouri
Columbia, MO 65201

1 Dr. J. Uhlaner
Perceptronics, Inc.
6271 Variel Avenue
Woodland Hills, CA 91

1 Dr. Howard Wainer
Bureau of Social Science Research
1990 M Street, N.W.
Washington, DC 20036

1 DR. THOMAS WALLSTEN
PSYCHOMETRIC LABORATORY
DAVIE HALL 013A
UNIVERSITY OF NORTH CAROLINA
CHAPEL HILL, NC 27514

1 Dr. John Wanous
Department of Management
Michigan University
East Lansing, MI

1 Dr. Phyllis Weaver
Graduate School of Education
Harvard University
200 Larsen Hall, Appian Way
Cambridge, MA 02138

1 DR. SUSAN E. WHITELY
PSYCHOLOGY DEPARTMENT
UNIVERSITY OF KANSAS
LAWRENCE, KANSAS 66044

Dr. Wolfgang Wildgrube
Streitkraefteamt
Rosenberg 5300
Bonn, West Germany D-5300

Dr. Robert Woud
School Examination Department
University of London
Gower Street
London WC1E 6EE

Dr. Karl Zinn
Center for Research on Learning
and Teaching
University of Michigan
Ann Arbor, MI 48104



Previous Publications 



Proceedings of the 1977 Computerized Adaptive Testing Conference. July 1978.

Research Reports

79-4. Effect of Point-in-Time in Instruction on the Measurement of Achievement.
      August 1979.
79-3. Relationships among Achievement Level Estimates from Three Item Characteristic Curve
      Scoring Methods. April 1979.
      Final Report: Bias-Free Computerized Testing. March 1979. (NTIS No. AD A068176)
79-2. Effects of Computerized Adaptive Testing on Black and White Students. March 1979.
      (NTIS No. AD A067928)
79-1. Computer Programs for Scoring Test Data with Item Characteristic Curve Models.
      February 1979. (NTIS No. AD A067732)
78-5. An Item Bias Investigation of a Standardized Aptitude Test. December 1978. (NTIS
      No. AD A064352)
78-4. A Construct Validation of Adaptive Achievement Testing. November 1978.
78-3. A Comparison of Levels and Dimensions of Performance in Black and White Groups on
      Tests of Vocabulary, Mathematics, and Spatial Ability. October 1978. (NTIS No.
      AD A062797)
78-2. The Effects of Knowledge of Results and Test Difficulty on Ability Test Performance
      and Psychological Reactions to Testing. September 1978.
78-1. A Comparison of the Fairness of Adaptive and Conventional Testing Strategies. August
      1978. (NTIS No. AD A039436)
77-7. An Information Comparison of Conventional and Adaptive Tests in the Measurement of
      Classroom Achievement. October 1977. (NTIS No. AD A047495)
77-6. An Adaptive Testing Strategy for Achievement Test Batteries. October 1977. (NTIS
      No. AD A046062)
77-5. Calibration of an Item Pool for the Adaptive Measurement of Achievement. September
      1977. (NTIS No. AD A044828)
77-4. A Rapid Item-Search Procedure for Bayesian Adaptive Testing. May 1977. (NTIS No.
      AD A041090)
77-3. Accuracy of Perceived Test-Item Difficulties. May 1977. (NTIS No. AD A041084)
77-2. A Comparison of Information Functions of Multiple-Choice and Free-Response Vocabulary
      Items. April 1977.
77-1. Applications of Computerized Adaptive Testing. March 1977. (NTIS No. AD A038114)
      Final Report: Computerized Ability Testing, 1972-1975. April 1976. (NTIS No.
      AD A024516)

76-5. Effects of Item Characteristics on Test Fairness. December 1976. (NTIS No.
      AD A035393)
76-4. Psychological Effects of Immediate Knowledge of Results and Adaptive Ability Testing.
      June 1976. (NTIS No. AD A027170)
76-3. Effects of Immediate Knowledge of Results and Adaptive Testing on Ability Test
      Performance. June 1976. (NTIS No. AD A028147)
76-2. Effects of Time Limits on Test-Taking Behavior. April 1976. (NTIS No. AD A024422)
76-1. Some Properties of a Bayesian Adaptive Ability Testing Strategy. March 1976. (NTIS
      No. AD A022964)
75-6. A Simulation Study of Stradaptive Ability Testing. December 1975. (NTIS No.
      AD A020961)
75-5. Computerized Adaptive Trait Measurement: Problems and Prospects. November 1975.
      (NTIS No. AD A018675)
75-4. A Study of Computer-Administered Stradaptive Ability Testing. October 1975. (NTIS
      No. AD A018758)
75-3. Empirical and Simulation Studies of Flexilevel Ability Testing. July 1975. (NTIS
      No. AD A013185)
75-2. TETREST: A FORTRAN IV Program for Calculating Tetrachoric Correlations. March 1975.
      (NTIS No. AD A007572)
75-1. An Empirical Comparison of Two-Stage and Pyramidal Adaptive Ability Testing. February
      1975. (NTIS No. AD A006733)
74-5. Strategies of Adaptive Ability Measurement. December 1974. (NTIS No. AD A004270)
74-4. Simulation Studies of Two-Stage Ability Testing. October 1974. (NTIS No. AD A001230)
74-3. An Empirical Investigation of Computer-Administered Pyramidal Ability Testing. July
      1974. (NTIS No. AD 783553)
74-2. A Word Knowledge Item Pool for Adaptive Ability Measurement. June 1974. (NTIS No.
      AD 781894)
74-1. A Computer Software System for Adaptive Ability Measurement. January 1974. (NTIS
      No. AD 773961)
73-3. The Stratified Adaptive Computerized Ability Test. September 1973. (NTIS No.
      AD 768376)
73-2. Comparison of Four Empirical Item Scoring Procedures. August 1973.
73-1. Ability Measurement: Conventional or Adaptive? February 1973. (NTIS No. AD 737788)

AD Numbers are those assigned by the Defense Documentation Center,
for retrieval through the National Technical Information Service.
Copies of these reports are available, while supplies last, from:
Psychometric Methods Program, Department of Psychology
N660 Elliott Hall, University of Minnesota
75 East River Road, Minneapolis, Minnesota 55455



